Abstract

This paper derives an improved sphere-packing (ISP) bound for finite-length codes whose transmission takes place over
symmetric memoryless channels. We first review classical results, i.e., the 1959 sphere-packing (SP59) bound of Shannon for
the Gaussian channel, and the 1967 sphere-packing (SP67) bound of Shannon et al. for discrete memoryless channels. A recent
improvement on the SP67 bound, as suggested by Valembois and Fossorier, is also discussed. These concepts are used for the
derivation of a new lower bound on the decoding error probability (referred to as the ISP bound) which is uniformly tighter than
the SP67 bound and its recent improved version. The ISP bound is applicable to symmetric memoryless channels, and some of
its applications are exemplified. Its tightness is studied by comparing it with bounds on the ML decoding error probability, and
computer simulations of iteratively decoded turbo-like codes. The paper also presents a technique which performs the entire
calculation of the SP59 bound in the logarithmic domain, thus facilitating the exact calculation of this bound for moderate to
large block lengths without the need for the asymptotic approximations provided by Shannon.

Index Terms: Block codes, error exponent, list decoding, sphere-packing bound, turbo-like codes.

I. Introduction

One of Shannon's favorite research topics was the theoretical study of the fundamental performance limits of
long block codes. During the fifties and sixties, this research work attracted Shannon and his colleagues at MIT
and Bell Labs; the contributions which came out of this work were published by Shannon et al. (see, e.g., the
collected papers of Shannon in [26] and the book of Gallager [12]). An overview of these results and their impact
was addressed by Berlekamp [2].

The introduction of turbo-like codes, which closely approach the Shannon capacity limit with moderate block
lengths and feasible decoding complexity, stirred up new interest in studying the limits of code performance as a
function of the block length (see, e.g., [9], [14], [15], [17], [23], [29], [35], [37]).

Following this direction of research, this paper aims to contribute to the study of the fundamental performance
limitations of finite-length codes whose transmission takes place over an arbitrary symmetric memoryless channel,
and also to study the fundamental tradeoff between the performance and block length of these codes. This study is
facilitated by theoretical bounds, and is also compared with practical results which are obtained by modern coding
techniques and sub-optimal decoding algorithms. In this respect, the reader is referred to a recent and comprehensive
tutorial paper by Costello and Forney [3] which traces the evolution of channel coding techniques, and also addresses
the significant contribution of error-correcting codes in improving the tradeoff between performance, block length
(delay) and complexity for practical applications.

The 1959 sphere-packing (SP59) bound of Shannon [24] serves for the evaluation of the performance limits 
of block codes whose transmission takes place over an AWGN channel. This lower bound on the decoding error 
probability is expressed in terms of the block length and rate of the code; however, it does not take into account 
the modulation used, but only assumes that the signals are of equal energy. It is often used as a reference for 
quantifying the sub-optimality of error-correcting codes associated with their decoding algorithms. 

The paper was submitted to the IEEE Trans. on Information Theory in March 2007. This work was presented in part at the 44th Annual
Allerton Conference on Communication, Control and Computing, Monticello, Illinois, USA, September 2006, and the 2006 IEEE 24th
Convention of Electrical and Electronics Engineers in Israel, Eilat, Israel, November 2006.

The authors are with the Department of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel (e-mails:
{igillw@tx, sason@ee}.technion.ac.il). Igal Sason is the corresponding author for this paper.

The 1967 sphere-packing (SP67) bound, derived by Shannon, Gallager and Berlekamp [25], provides a lower
bound on the decoding error probability of block codes as a function of their block length and code rate, and
it applies to arbitrary discrete memoryless channels. Like the random coding bound of Gallager [11], the SP67
bound decays to zero exponentially with the block length for all rates below the channel capacity. Further, the error
exponent of the SP67 bound is tight at the portion of the rate region between the critical rate ($R_c$) and the channel
capacity; for all rates in this range, the error exponents of the SP67 and the random coding bounds coincide (see
[25, Part 1]).

In spite of its exponential behavior, the SP67 bound appears to be loose for codes of small to moderate block 
lengths. This weakness is due to the original focus in [25] on asymptotic analysis. In their paper [35], Valembois 
and Fossorier revisited the SP67 bound in order to improve its tightness for finite-length block codes (especially, for 
codes of short to moderate block lengths), and also extended its validity to memoryless continuous-output channels 
(e.g., the binary-input AWGN channel). The remarkable improvement of their bound over the classical SP67 bound 
was exemplified in [35]; moreover, it provides an interesting alternative to the SP59 bound which is particularized 
for the AWGN channel [24]. 

In this work, we derive an improved sphere-packing bound (referred to as the ISP bound) which further enhances 
the tightness of the bounding technique in [25], especially for codes of short to moderate block lengths; this new 
bound is valid for all symmetric memoryless channels. 

The paper is structured as follows: Section II reviews the concepts used in the derivation of the SP67 bound
[25, Part 1], and its recent improvements in [35] which are especially effective for codes of short to moderate
block lengths. In Section III we derive the ISP bound which further enhances the tightness of the bound in [35] for
symmetric memoryless channels; the derivation of this bound relies on concepts and notation presented in Section II.
Section IV starts by reviewing the SP59 bound of Shannon [24], and presenting the numerical algorithm used in
[35] for calculating this bound. The numerical instability of this algorithm for codes of moderate to large block
lengths motivates the derivation of an alternative algorithm in Section IV which facilitates the exact calculation of
the SP59 bound, irrespective of the block length. Section V provides numerical results which serve to compare
the tightness of the ISP bound, derived in Section III, with the SP59 bound of Shannon [24] and the recent
sphere-packing bound in [35]. The tightness of the ISP bound is exemplified in Section V for M-ary phase-shift-keying
(PSK) block coded modulation schemes whose transmission takes place over the AWGN channel, and also for the
binary erasure channel (BEC). Additionally, Section V applies the sphere-packing bounds to give lower bounds on
the block length required to achieve a required performance on a given channel. These lower bounds are compared
with the performance of some practically decodable codes which are presented in recent works. We conclude our
discussion in Section VI. Technical calculations are relegated to the appendices.

II. The 1967 Sphere-Packing Bound and Improvements 

In this section, we outline the derivation of the SP67 bound. We then survey the improvements to this bound, as
suggested in [35], which also extend the validity of the improved bound to memoryless discrete-input
continuous-output channels. This review serves as a preparatory stage for presenting an improved sphere-packing bound in the
next section; the new bound further enhances the tightness of the sphere-packing bounding technique for
finite-length codes whose transmission takes place over symmetric memoryless channels. For a comprehensive tutorial
review of sphere-packing bounds, the reader is referred to [23, Chapter 5]. Due to the strong relevance of the
sphere-packing bounds to the analysis in this paper, we note that the two "Information and Control" papers related
to the SP67 bound [25] and the paper related to the SP59 bound [24] are also published in the book which consists
of all the papers of Shannon [26].

A. The 1967 Sphere-Packing Bound

Let us consider a block code C which consists of M codewords each of length N, and denote its codewords
by $\mathbf{x}_1, \ldots, \mathbf{x}_M$. Assume that C is transmitted over a discrete memoryless channel (DMC) and decoded by a list
decoder; for each received sequence $\mathbf{y}$, the decoder outputs a list of at most L integers belonging to the set
$\{1, 2, \ldots, M\}$ which correspond to the indices of the codewords. A list decoding error is declared if the index
of the transmitted codeword does not appear in the list. List decoding, originally introduced by Elias [10] and
Wozencraft [38], signifies an important class of decoding algorithms. In [25], the authors derive a lower bound on
the decoding error probability of an arbitrary block code with M codewords of length N; the bound applies to an
arbitrary list decoder where the size of the list is limited to L. The particular case where L = 1 clearly provides a
lower bound on the decoding error probability under maximum-likelihood (ML) decoding.

Let $\mathcal{Y}_m$ denote the set of output sequences $\mathbf{y}$ for which message m is on the decoding list, and define
$P_m(\mathbf{y}) = \Pr(\mathbf{y} | \mathbf{x}_m)$. The probability of list decoding error when message m is sent over the channel is given by

$$P_{e,m} = \sum_{\mathbf{y} \in \mathcal{Y}_m^c} P_m(\mathbf{y}) \tag{1}$$

where the superscript 'c' stands for the complementary set. For the block code and list decoder under consideration,
let $P_{e,\max}$ designate the maximal value of $P_{e,m}$ where $m \in \{1, 2, \ldots, M\}$. Assuming that all the codewords are
equally likely to be transmitted, the average decoding error probability is given by

$$P_e = \frac{1}{M} \sum_{m=1}^{M} P_{e,m}.$$

Referring to a list decoder of size at most L, the code rate (in nats per channel use) is defined as $R = \frac{\ln(M/L)}{N}$.

The derivation of the SP67 bound is divided in [25, Part 1] into three main steps. The first step refers to the 
derivation of upper and lower bounds on the error probability of a code consisting of two codewords only. These 
bounds are given by the following theorem: 

Theorem 2.1 (Upper and Lower Bounds on the Pairwise Error Probability): [25, Theorem 5]. Let $P_1$ and $P_2$ be
two probability assignments defined over a discrete set of sequences, $\mathcal{Y}_1$ and $\mathcal{Y}_2 = \mathcal{Y}_1^c$ be (disjoint) decision regions
for these sequences, $P_{e,1}$ and $P_{e,2}$ be given by (1), and assume that $P_1(\mathbf{y}) P_2(\mathbf{y}) \neq 0$ for at least one sequence $\mathbf{y}$.
Then, for all $s \in (0, 1)$

$$P_{e,1} > \frac{1}{4} \exp\Bigl( \mu(s) - s \mu'(s) - s \sqrt{2 \mu''(s)} \Bigr) \tag{2}$$

or

$$P_{e,2} > \frac{1}{4} \exp\Bigl( \mu(s) + (1-s) \mu'(s) - (1-s) \sqrt{2 \mu''(s)} \Bigr) \tag{3}$$

where

$$\mu(s) \triangleq \ln \Bigl( \sum_{\mathbf{y}} P_1(\mathbf{y})^{1-s} P_2(\mathbf{y})^{s} \Bigr), \qquad 0 < s < 1. \tag{4}$$

Furthermore, for an appropriate choice of the decision regions $\mathcal{Y}_1$ and $\mathcal{Y}_2$, the following upper bounds hold:

$$P_{e,1} \leq \exp\bigl( \mu(s) - s \mu'(s) \bigr) \tag{5}$$

and

$$P_{e,2} \leq \exp\bigl( \mu(s) + (1-s) \mu'(s) \bigr). \tag{6}$$

The function $\mu$ is non-positive and convex over the interval (0, 1). The convexity of $\mu$ is strict unless $\frac{P_1(\mathbf{y})}{P_2(\mathbf{y})}$ is
constant over all the sequences $\mathbf{y}$ for which $P_1(\mathbf{y}) P_2(\mathbf{y}) \neq 0$. Moreover, the function $\mu$ is strictly negative over the
interval (0, 1) unless $P_1(\mathbf{y}) = P_2(\mathbf{y})$ for all $\mathbf{y}$.

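Proof: A full proof of Theorem 2.1 is given in [25, Section III]. In the following, we present a brief outline
of the proof which serves to emphasize the parallelism between Theorem 2.1 and the first part of the derivation of
the ISP bound in Section III. To this end, let us define the log-likelihood ratio (LLR) as

$$D(\mathbf{y}) \triangleq \ln \left( \frac{P_2(\mathbf{y})}{P_1(\mathbf{y})} \right) \tag{7}$$

and the probability distribution

$$Q_s(\mathbf{y}) \triangleq \frac{P_1(\mathbf{y})^{1-s} P_2(\mathbf{y})^{s}}{\sum_{\mathbf{y}'} P_1(\mathbf{y}')^{1-s} P_2(\mathbf{y}')^{s}}, \qquad 0 < s < 1. \tag{8}$$

It is simple to show (see [25]) that for all 0 < s < 1, the first and second derivatives of $\mu$ in (4) are equal to the
statistical expectation and variance of the LLR, respectively, taken with respect to (w.r.t.) the probability distribution
$Q_s$ in (8). This gives the following equalities:

$$\mu'(s) = \mathrm{E}_{Q_s}\bigl( D(\mathbf{y}) \bigr) \tag{9}$$

$$\mu''(s) = \mathrm{Var}_{Q_s}\bigl( D(\mathbf{y}) \bigr) \tag{10}$$

$$P_1(\mathbf{y}) = \exp\bigl( \mu(s) - s D(\mathbf{y}) \bigr) \, Q_s(\mathbf{y}) \tag{11}$$

$$P_2(\mathbf{y}) = \exp\bigl( \mu(s) + (1-s) D(\mathbf{y}) \bigr) \, Q_s(\mathbf{y}) \tag{12}$$

where equalities (11) and (12) follow easily from (4), (7) and (8). For every 0 < s < 1, we further define the set
of sequences

$$\mathcal{Y}_s \triangleq \Bigl\{ \mathbf{y} \in \mathcal{Y} : \bigl| D(\mathbf{y}) - \mu'(s) \bigr| \leq \sqrt{2 \mu''(s)} \Bigr\}. \tag{13}$$

For any choice of a decision region $\mathcal{Y}_1$, the conditional error probability given that the first message was transmitted
satisfies

$$P_{e,1} = \sum_{\mathbf{y} \in \mathcal{Y}_1^c} P_1(\mathbf{y}) \geq \sum_{\mathbf{y} \in \mathcal{Y}_1^c \cap \mathcal{Y}_s} P_1(\mathbf{y}) \overset{(a)}{=} \sum_{\mathbf{y} \in \mathcal{Y}_1^c \cap \mathcal{Y}_s} \exp\bigl( \mu(s) - s D(\mathbf{y}) \bigr) \, Q_s(\mathbf{y}) \overset{(b)}{\geq} \exp\Bigl( \mu(s) - s \mu'(s) - s \sqrt{2 \mu''(s)} \Bigr) \sum_{\mathbf{y} \in \mathcal{Y}_1^c \cap \mathcal{Y}_s} Q_s(\mathbf{y}) \tag{14}$$

where (a) follows from (11) and (b) relies on the definition of $\mathcal{Y}_s$ in (13). Using similar arguments and relying on
(12), we also get that

$$P_{e,2} \geq \exp\Bigl( \mu(s) + (1-s) \mu'(s) - (1-s) \sqrt{2 \mu''(s)} \Bigr) \sum_{\mathbf{y} \in \mathcal{Y}_2^c \cap \mathcal{Y}_s} Q_s(\mathbf{y}). \tag{15}$$

Since $\mathcal{Y}_1$ and $\mathcal{Y}_2$ form a partition of the observation space, we have that

$$\sum_{\mathbf{y} \in \mathcal{Y}_1^c \cap \mathcal{Y}_s} Q_s(\mathbf{y}) + \sum_{\mathbf{y} \in \mathcal{Y}_2^c \cap \mathcal{Y}_s} Q_s(\mathbf{y}) = \sum_{\mathbf{y} \in \mathcal{Y}_s} Q_s(\mathbf{y}) > \frac{1}{2}$$

where the last transition relies on (9) and (10) and follows from Chebychev's inequality. Therefore, at least one
of the two sums on the LHS of the expression above must be greater than $\frac{1}{4}$. Substituting this in (14) and (15)
completes the proof of the lower bound on the error probability in (2) and (3). The upper bound on the error
probability in (5) and (6) is attained by selecting the decision region for the first codeword to be

$$\mathcal{Y}_1 \triangleq \bigl\{ \mathbf{y} \in \mathcal{Y} : D(\mathbf{y}) < \mu'(s) \bigr\}$$

and the decision region for the second codeword as $\mathcal{Y}_2 \triangleq \mathcal{Y}_1^c$. The proof for the upper bounds in (5) and (6) follows
directly from (11), (12) and the particular choice of $\mathcal{Y}_1$ and $\mathcal{Y}_2$ above. ■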
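
The identities used in the proof can be checked numerically on a small example. The following is a minimal sketch, assuming a hypothetical pair of probability assignments over a three-letter set (the values of `P1` and `P2` and the helper names are illustrative and do not come from [25]). It compares finite-difference derivatives of the function in (4) with the mean and variance of the LLR under the tilted distribution (8), evaluates the mass of the set in (13), and checks the upper bounds (5) and (6) for the decision region used at the end of the proof.

```python
import math

# Hypothetical toy pair of probability assignments on a 3-letter set.
P1 = [0.5, 0.3, 0.2]
P2 = [0.1, 0.3, 0.6]
D = [math.log(b / a) for a, b in zip(P1, P2)]  # LLR D(y) = ln(P2(y)/P1(y)), as in (7)

def mu(s):
    """mu(s) = ln sum_y P1(y)^(1-s) P2(y)^s, as in (4)."""
    return math.log(sum(a ** (1 - s) * b ** s for a, b in zip(P1, P2)))

def tilted(s):
    """Tilted distribution Q_s of (8)."""
    w = [a ** (1 - s) * b ** s for a, b in zip(P1, P2)]
    z = sum(w)
    return [x / z for x in w]

def summary(s, h=1e-3):
    """Collect the quantities that appear in the proof for a given s in (0, 1)."""
    Q = tilted(s)
    mean = sum(q * d for q, d in zip(Q, D))               # E_{Q_s}[D], cf. (9)
    var = sum(q * (d - mean) ** 2 for q, d in zip(Q, D))  # Var_{Q_s}[D], cf. (10)
    # Central finite differences of mu, to compare with (9) and (10).
    d1 = (mu(s + h) - mu(s - h)) / (2 * h)
    d2 = (mu(s + h) - 2 * mu(s) + mu(s - h)) / h ** 2
    # Q_s-mass of the set Y_s in (13); Chebyshev's inequality gives at least 1/2.
    mass = sum(q for q, d in zip(Q, D) if abs(d - mean) <= math.sqrt(2 * var))
    # Decision region Y_1 = {y : D(y) < mu'(s)} and its complement Y_2.
    pe1 = sum(a for a, d in zip(P1, D) if d >= mean)
    pe2 = sum(b for b, d in zip(P2, D) if d < mean)
    ub1 = math.exp(mu(s) - s * mean)        # right-hand side of (5)
    ub2 = math.exp(mu(s) + (1 - s) * mean)  # right-hand side of (6)
    return {"mean": mean, "var": var, "d1": d1, "d2": d2,
            "mass": mass, "pe1": pe1, "pe2": pe2, "ub1": ub1, "ub2": ub2}

for s in (0.2, 0.5, 0.8):
    r = summary(s)
    print(s, round(r["mass"], 3), r["pe1"] <= r["ub1"], r["pe2"] <= r["ub2"])
```

The finite-difference step `h` is a convenience of this sketch; an exact evaluation would use (9) and (10) directly.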

The initial motivation given for Theorem 2.1 is the calculation of lower bounds on the error probability of a
two-word code. However, it is valid for any pair of probability assignments $P_1$ and $P_2$ and decision regions $\mathcal{Y}_1$
and $\mathcal{Y}_2$ which form a partition of the observation space.

In the continuation of the derivation of the SP67 bound in [25], this theorem is used in order to control the
size of a decision region of a particular codeword without directly referring to the other codewords. To this end,
an arbitrary probability tilting measure $f_N$ is introduced in [25] over all N-length sequences of channel outputs,
requiring that it is factorized in the form

$$f_N(\mathbf{y}) = \prod_{n=1}^{N} f(y_n) \tag{16}$$

for an arbitrary output sequence $\mathbf{y} = (y_1, \ldots, y_N)$. The size of the set $\mathcal{Y}_m$ is defined as

$$F(\mathcal{Y}_m) \triangleq \sum_{\mathbf{y} \in \mathcal{Y}_m} f_N(\mathbf{y}). \tag{17}$$

Next, [25] relies on Theorem 2.1 in order to relate the conditional error probability $P_{e,m}$ and $F(\mathcal{Y}_m)$ for fixed
composition codes; this is done by associating $\Pr(\cdot | \mathbf{x}_m)$ and $f_N$ with $P_1$ and $P_2$, respectively. Theorem 2.1 is applied
to derive a parametric lower bound on the size of the decision region $\mathcal{Y}_m$ or on the conditional error probability
$P_{e,m}$. Since the list size is limited to L, we have $\mathbf{y} \in \mathcal{Y}_m$ for at most L indices $m \in \{1, \ldots, M\}$ and
hence $\sum_{m=1}^{M} F(\mathcal{Y}_m) \leq L$. Therefore, there exists an index m so that $F(\mathcal{Y}_m) \leq \frac{L}{M}$ and for this unknown value of m,
one can upper bound the conditional error probability $P_{e,m}$ by

$$P_{e,\max} \triangleq \max_{m \in \{1, \ldots, M\}} P_{e,m}.$$

Using Theorem 2.1, this provides a lower bound on $P_{e,\max}$. Next, the probability assignment $f = f_s$ is optimized
in [25], so as to get the tightest (i.e., maximal) lower bound within this form while considering a code whose
composition minimizes the bound (so that the bound holds for all fixed composition codes). A solution for this
min-max problem, as provided in [25, Eq. 4.18-4.20], leads to the following theorem which gives a lower bound
on the maximal block error probability of an arbitrary fixed composition block code (for a more detailed review
of these concepts, see [23, Section 5.3]).

Theorem 2.2 (Sphere-Packing Bound on the Maximal Decoding Error Probability for Fixed Composition Codes):
[25, Theorem 6]. Let C be a fixed composition code of M codewords and block length N. Assume that the
transmission of C takes place over a DMC, and let $P(j|k)$ be the set of transition probabilities characterizing this
channel (where $j \in \{0, \ldots, J-1\}$ and $k \in \{0, \ldots, K-1\}$ designate the channel output and input, respectively).
For an arbitrary list decoder whose list size is limited to L, the maximal error probability ($P_{e,\max}$) satisfies

$$P_{e,\max} \geq \exp\left\{ -N \left[ E_{\mathrm{sp}}\left( R - \frac{\ln 4}{N} - \varepsilon \right) + \sqrt{\frac{8}{N}} \, \ln\left( \frac{e}{\sqrt{P_{\min}}} \right) + \frac{\ln 4}{N} \right] \right\}$$

where $R = \frac{\ln(M/L)}{N}$ is the rate of the code, $P_{\min}$ designates the smallest non-zero transition probability of the DMC,
the parameter $\varepsilon$ is an arbitrarily small positive number, and the function $E_{\mathrm{sp}}$ is given by

$$E_{\mathrm{sp}}(R) \triangleq \sup_{\rho \geq 0} \bigl( E_0(\rho) - \rho R \bigr) \tag{18}$$

$$E_0(\rho) \triangleq \max_{\mathbf{q}} E_0(\rho, \mathbf{q}) \tag{19}$$

$$E_0(\rho, \mathbf{q}) \triangleq -\ln \left( \sum_{j=0}^{J-1} \left[ \sum_{k=0}^{K-1} q_k \, P(j|k)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right). \tag{20}$$

The maximum in the RHS of (19) is taken over all probability vectors $\mathbf{q} = (q_0, \ldots, q_{K-1})$, i.e., over all $\mathbf{q}$ with K
non-negative components summing to 1.
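
As a small illustration of (18)-(20), the following sketch evaluates the sphere-packing exponent for a binary symmetric channel (BSC). It rests on assumptions that are not part of the theorem: the channel (a BSC with crossover probability 0.1), the use of the uniform input distribution (which maximizes (19) for this channel by symmetry), and a plain grid search over the parameter in (18) in place of an exact optimization. The function names are hypothetical. The result is compared with the well-known closed form of the sphere-packing exponent of the BSC, a binary Kullback-Leibler divergence.

```python
import math

def E0(rho, q, P):
    """E_0(rho, q) of (20); P[k][j] is the transition probability P(j|k)."""
    total = 0.0
    for j in range(len(P[0])):
        inner = sum(qk * P[k][j] ** (1.0 / (1.0 + rho)) for k, qk in enumerate(q))
        total += inner ** (1.0 + rho)
    return -math.log(total)

def E_sp(R, q, P, rho_max=20.0, steps=20000):
    """Grid approximation of the supremum in (18) for a fixed input distribution q."""
    return max(E0(rho_max * i / steps, q, P) - (rho_max * i / steps) * R
               for i in range(steps + 1))

def binary_entropy(x):
    """Binary entropy in nats."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def kl(a, b):
    """Binary Kullback-Leibler divergence D(a || b) in nats."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

p = 0.1                            # assumed crossover probability
bsc = [[1 - p, p], [p, 1 - p]]     # bsc[k][j] = P(j|k)
delta = 0.2                        # picks the rate through ln 2 - h(delta)
R = math.log(2) - binary_entropy(delta)

# For the BSC, the two printed values should agree up to the grid resolution.
print(E_sp(R, [0.5, 0.5], bsc), kl(delta, p))
```

For channels where the optimal input distribution is not known in advance, the maximization over the input distribution in (19) would have to be added; this sketch deliberately avoids it.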

The reason for considering fixed composition codes in [25] is that, in general, the optimal probability distribution
$f_s$ may depend on the composition of the codewords through the choice of the parameter s in (0, 1) (see [25, p. 96]).

The next step in the derivation of the SP67 bound is the application of Theorem 2.2 towards the derivation
of a lower bound on the maximal block error probability of an arbitrary block code. This is performed by lower
bounding the maximal block error probability of the code by the maximal block error probability of its largest fixed
composition subcode. Since the number of possible compositions is polynomial in the block length, one can lower
bound the rate of the largest fixed composition subcode by $R - O\left( \frac{\ln N}{N} \right)$ where R is the rate of the original code.
Clearly, the rate loss caused by considering this subcode vanishes when the block length tends to infinity; however,
it loosens the bound for codes of short to moderate block lengths. Finally, the bound on the maximal block error
probability is transformed into a bound on the average block error probability by considering an expurgated code
which contains half of the codewords of the original code with the lowest decoding error probability. This finally
leads to the SP67 bound in [25, Part 1].
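
To give a feeling for the size of this rate loss at finite block lengths, the following sketch counts the compositions exactly. It relies on the standard counting fact that N symbols can be divided among K input letters in C(N+K-1, K-1) ways, so the largest fixed composition subcode keeps at least a fraction 1/C(N+K-1, K-1) of the codewords. The function names are hypothetical.

```python
import math

def num_compositions(N, K):
    """Number of ways to split N codeword positions among K input letters."""
    return math.comb(N + K - 1, K - 1)

def rate_loss(N, K):
    """Upper bound (in nats per channel use) on the rate lost by passing
    to the largest fixed composition subcode."""
    return math.log(num_compositions(N, K)) / N

for N in (50, 200, 1000, 10000):
    print(N, rate_loss(N, 2), rate_loss(N, 8))
```

The loss decays like ln(N)/N, so it is negligible asymptotically but visible for block lengths of a few hundred symbols, especially for larger input alphabets.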

Theorem 2.3 (The 1967 Sphere-Packing Bound for Discrete Memoryless Channels): [25, Theorem 2]. Let C be
an arbitrary block code whose transmission takes place over a DMC. Assume that the DMC is specified by the set
of transition probabilities $P(j|k)$ where $k \in \{0, \ldots, K-1\}$ and $j \in \{0, \ldots, J-1\}$ designate the channel input
and output alphabets, respectively. Assume that the code C forms a set of M codewords of length N (i.e., each
codeword is a sequence of N letters from the input alphabet), and consider an arbitrary list decoder where the size
of the list is limited to L. Then, the average decoding error probability of the code C satisfies

$$P_e(N, M, L) \geq \exp\left\{ -N \left[ E_{\mathrm{sp}}\left( R - O_1\left( \frac{\ln N}{N} \right) \right) + O_2\left( \frac{1}{\sqrt{N}} \right) \right] \right\}$$

where $R = \frac{\ln(M/L)}{N}$ and the error exponent $E_{\mathrm{sp}}(R)$ is introduced in (18). The terms

$$O_1\left( \frac{\ln N}{N} \right) = \frac{\ln 8}{N} + \frac{K \ln N}{N}, \qquad O_2\left( \frac{1}{\sqrt{N}} \right) = \sqrt{\frac{8}{N}} \, \ln\left( \frac{e}{\sqrt{P_{\min}}} \right) + \frac{\ln 8}{N} \tag{21}$$

scale like $\frac{\ln N}{N}$ and the inverse of the square root of N, respectively (hence, they both vanish as we let N tend to
infinity), and $P_{\min}$ denotes the smallest non-zero transition probability of the DMC.



B. Recent Improvements on the 1967 Sphere-Packing Bound 

In [35], Valembois and Fossorier revisit the derivation of the SP67 bound, focusing this time on finite-length 
block codes. They present four modifications to the classical derivation in [25] which improve the pre-exponent 
of the SP67 bound. The new bound derived in [35] is also valid for memoryless channels with discrete input and 
continuous output (as opposed to the SP67 bound which is only valid for DMCs). It is applied to the binary-input 
AWGN channel, and is also compared with the SP59 bound which holds for any set of equal energy signals 
transmitted over the AWGN channel; this comparison shows that the recent bound in [35] provides an interesting 
alternative to the SP59 bound, especially for high code rates. In this section, we outline the improvements suggested 
in [35] and present the resulting bound. 

The first modification suggested in [35] is the addition of a free parameter in the derivation of the lower bound 
on the decoding error probability of two-word codes; this free parameter is used in conjunction with Chebychev's 
inequality, and it is optimized in order to get the tightest bound within this form. 
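
One way to see how such a free parameter enters is the following sketch, which widens the window used in (13) by a factor x; the notation $\mathcal{Y}_s^x$ is introduced here for illustration only and is not taken from [35].

```latex
% Assumed generalization of the set in (13), with a free parameter x > sqrt(2)/2:
\mathcal{Y}_s^x \triangleq \Bigl\{ \mathbf{y} \in \mathcal{Y} :
    \bigl| D(\mathbf{y}) - \mu'(s) \bigr| \le x \sqrt{2 \mu''(s)} \Bigr\}

% Chebychev's inequality, applied with (9) and (10):
\sum_{\mathbf{y} \in \mathcal{Y}_s^x} Q_s(\mathbf{y}) \ge 1 - \frac{1}{2 x^2}

% Hence at least one of the two partial sums is not smaller than
\frac{1}{2} \Bigl( 1 - \frac{1}{2 x^2} \Bigr) = \frac{1}{4} \Bigl( 2 - \frac{1}{x^2} \Bigr)
```

Under this sketch, the square-root terms in the exponents of (2) and (3) are multiplied by x while the pre-exponential constant becomes $\frac{1}{4}(2 - \frac{1}{x^2})$, so a larger x trades a better constant for a worse exponent; the choice x = 1 recovers the window used in Theorem 2.1.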

A second improvement presented in [35] is related to a simplification in [25, Part 1] where the inequality
$s \sqrt{\mu_k''(s)} \leq \ln\left( \frac{e}{\sqrt{P_{\min}}} \right)$ is applied. This bound on the second derivative of $\mu$ results in no asymptotic loss, but
loosens the bound on the decoding error probability for short to moderate block lengths. By using the exact value of
$\mu''$ instead, the tightness of the resulting bound is further improved in [35]. This modification also makes the bound
suitable to memoryless channels with continuous output, as it is no longer required that $P_{\min}$ is positive. It should
be noted that this causes a small discrepancy in the derivation of the bound; the derivation of a lower bound on the
block error probability which is uniform over all fixed composition codes relies on finding the composition which
minimizes the lower bound. The optimal composition is given in [25, Eq. 4.18, 4.19] for the case where the upper
bound on $\mu''$ is applied. In [35], the same composition is used without checking whether it is still the composition
which minimizes the lower bound. However, as we see in the next section, for the class of symmetric memoryless
channels the value of the bound is independent of the code composition; therefore, the bound of Valembois and
Fossorier [35, Theorem 7] (referred to as the VF bound) stays valid. This class of channels includes all memoryless
binary-input output-symmetric (MBIOS) channels.

A third improvement in [35] concerns the particular selection of the value of $\rho \geq 0$ which leads to the derivation
of Theorem 2.3. In [25], $\rho$ is set to be the value $\tilde{\rho}$ which maximizes the error exponent of the SP67 bound (i.e.,
the upper bound on the error exponent). This choice emphasizes the similarity between the error exponents of the
SP67 bound and the random coding bound, hence proving that the error exponent of the SP67 bound is tight for all
rates above the critical rate of the channel. In order to tighten the bound for the finite-length case, [35] chooses the
value of $\rho$ to be $\rho^*$ which provides the tightest possible lower bound on the decoding error probability. For rates
above the critical rate of the channel, the asymptotic accuracy of the original SP67 bound implies that as the block
length tends to infinity, $\tilde{\rho}$ tends to $\rho^*$. However, for codes of finite block length, this simple observation tightens
the bound with almost no penalty in the computational complexity of the resulting bound.

The fourth observation made in [35] concerns the final stage in the derivation of the SP67 bound. In order to get a
lower bound on the maximal block error probability of an arbitrary block code, the derivation in [25] considers the
maximal block error probability of a fixed composition subcode of the original code. In [25], a simple lower bound
on the size of the largest fixed composition subcode is given; namely, the size of the largest fixed composition
subcode is not less than the size of the entire code divided by the number of possible compositions. Since the
number of possible compositions is equal to the number of possible ways to divide N symbols into K types, this
value is given by $\binom{N+K-1}{K-1}$. To simplify the final expression of the SP67 bound, [25] applies the upper bound
$\binom{N+K-1}{K-1} \leq N^K$. Since this expression is polynomial in the block length N, there is no asymptotic loss to the
error exponent. However, by using the exact expression for the number of possible compositions, the bound in
[35] is tightened for codes of short to moderate block lengths.

Applying these four modifications in [35] yields
an improved lower bound on the decoding error probability of block codes transmitted over memoryless channels
with finite input alphabets. As mentioned above, these modifications also extend the validity of the new bound to
memoryless channels with discrete input and continuous output. However, the requirement of a finite input alphabet
still remains, as it is required in order to apply the bound to arbitrary block codes, and not only to fixed composition
codes. The VF bound [35] is given in the following theorem:

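Theorem 2.4 (Improvement on the 1967 Sphere-Packing Bound for Discrete Memoryless Channels): [35, Theorem 7].
Under the assumptions and notation used in Theorem 2.3, the average decoding error probability satisfies

$$P_e(N, M, L) \geq \exp\bigl\{ -N E_{\mathrm{sp}}(R, N) \bigr\}$$

where

$$E_{\mathrm{sp}}(R, N) \triangleq \inf_{x > \frac{\sqrt{2}}{2}} \left\{ E_0(\rho_x) - \rho_x \left( R - O_1\left( \frac{\ln N}{N}, x \right) \right) + O_2\left( \frac{1}{\sqrt{N}}, x, \rho_x \right) \right\}$$

$$O_1\left( \frac{\ln N}{N}, x \right) \triangleq \frac{\ln 8}{N} + \frac{\ln \binom{N+K-1}{K-1}}{N} - \frac{\ln\left( 2 - \frac{1}{x^2} \right)}{N}$$

$$O_2\left( \frac{1}{\sqrt{N}}, x, \rho \right) \triangleq x \sqrt{ \frac{8}{N} \sum_{k=0}^{K-1} q_{k,\rho} \, \nu_k^{(2)}(\rho) } + \frac{\ln 8}{N} - \frac{\ln\left( 2 - \frac{1}{x^2} \right)}{N}$$

$$\nu_k^{(1)}(\rho) \triangleq \frac{ \sum_{j=0}^{J-1} \beta_{j,k,\rho} \, \ln\left( \frac{\beta_{j,k,\rho}}{P(j|k)} \right) }{ \sum_{j=0}^{J-1} \beta_{j,k,\rho} }$$

$$\nu_k^{(2)}(\rho) \triangleq \frac{ \sum_{j=0}^{J-1} \beta_{j,k,\rho} \, \ln^2\left( \frac{\beta_{j,k,\rho}}{P(j|k)} \right) }{ \sum_{j=0}^{J-1} \beta_{j,k,\rho} } - \bigl[ \nu_k^{(1)}(\rho) \bigr]^2$$

$$\beta_{j,k,\rho} \triangleq P(j|k)^{\frac{1}{1+\rho}} \cdot \left( \sum_{k'=0}^{K-1} q_{k',\rho} \, P(j|k')^{\frac{1}{1+\rho}} \right)^{\rho}$$

where $\mathbf{q}_\rho = (q_{0,\rho}, \ldots, q_{K-1,\rho})$ designates the input distribution which maximizes $E_0(\rho, \mathbf{q})$ in (19), and the parameter
$\rho = \rho_x$ is determined by solving the equation

$$R - O_1\left( \frac{\ln N}{N}, x \right) = -\frac{1}{\rho} \sum_{k=0}^{K-1} q_{k,\rho} \, \nu_k^{(1)}(\rho) + \frac{x}{\rho} \sqrt{ \frac{2}{N} \sum_{k=0}^{K-1} q_{k,\rho} \, \nu_k^{(2)}(\rho) }.$$

For a more detailed review of the improvements suggested in [35], the reader is referred to [23, Section 5.4].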

III. An Improved Sphere-Packing Bound

In this section, we derive an improved lower bound on the decoding error probability which utilizes the sphere- 
packing bounding technique. This bound is valid for symmetric memoryless channels with a finite input alphabet, 
and is referred to as an improved sphere-packing (ISP) bound. We begin with some necessary definitions and basic 
properties of symmetric memoryless channels which are used in this section for the derivation of the ISP bound. 

A. Symmetric Memoryless Channels 

Definition 3.1: A bijective mapping $g : \mathcal{J} \to \mathcal{J}$ where $\mathcal{J} \subseteq \mathbb{R}^d$ is said to be unitary if for any integrable
generalized function $f : \mathcal{J} \to \mathbb{R}$

$$\int_{\mathcal{J}} f(x) \, dx = \int_{\mathcal{J}} f(g(x)) \, dx \tag{23}$$

where by generalized function we mean a function which may contain a countable number of shifted Dirac delta
functions. If the projection of $\mathcal{J}$ over some of the d dimensions is countable, the integration over these dimensions
is turned into a sum.

Remark 3.1: The following properties also hold:

1) If g is a unitary mapping, so is $g^{-1}$.

2) If $\mathcal{J}$ is a countable set, then $g : \mathcal{J} \to \mathcal{J}$ is unitary if and only if g is bijective.

3) Let $\mathcal{J}$ be an open set and $g : \mathcal{J} \to \mathcal{J}$ be a bijective function. Denote

$$g(x_1, \ldots, x_d) = \bigl( g_1(x_1, \ldots, x_d), \ldots, g_d(x_1, \ldots, x_d) \bigr)$$

and assume that the partial derivatives $\frac{\partial g_i}{\partial x_j}$ exist for all $i, j \in \{1, 2, \ldots, d\}$. Then g is unitary if and only if
the Jacobian satisfies $|J(x)| = 1$ for all $x \in \mathcal{J}$.

Proof: The first property follows from (23) and by defining $\tilde{f}(x) \triangleq f(g^{-1}(x))$; this gives

$$\int_{\mathcal{J}} f(g^{-1}(x)) \, dx = \int_{\mathcal{J}} \tilde{f}(x) \, dx = \int_{\mathcal{J}} \tilde{f}(g(x)) \, dx = \int_{\mathcal{J}} f\bigl( (g^{-1} \circ g)(x) \bigr) \, dx = \int_{\mathcal{J}} f(x) \, dx.$$

The second property stems from the fact that for countable sets, the integral is turned into a sum, and the equality

$$\sum_{j \in \mathcal{J}} f(j) = \sum_{j \in \mathcal{J}} f(g(j))$$

holds by changing the order of summation. Finally, the third property is proved by a transform of the integrator in
the LHS of (23) from $x = (x_1, \ldots, x_d)$ to $\bigl( g_1(x_1, \ldots, x_d), \ldots, g_d(x_1, \ldots, x_d) \bigr)$. ■
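
The second and third properties can be illustrated numerically. The sketch below uses hypothetical helper names and examples: it checks that a bijection on a finite set preserves sums while a non-bijective map does not, and it estimates the Jacobian determinant of a planar rotation by central finite differences (the finite-difference estimate is a convenience of this sketch, not part of the proof).

```python
import math

def sums_agree(f, g, J):
    """Compare sum_j f(j) with sum_j f(g(j)) over a finite set J."""
    return math.isclose(sum(f(x) for x in J), sum(f(g(x)) for x in J))

def jacobian_det_2d(g, x, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of g: R^2 -> R^2 at x."""
    x1, x2 = x
    # Column of partial derivatives with respect to x1, then with respect to x2.
    a = [(u - v) / (2 * h) for u, v in zip(g((x1 + h, x2)), g((x1 - h, x2)))]
    b = [(u - v) / (2 * h) for u, v in zip(g((x1, x2 + h)), g((x1, x2 - h)))]
    return a[0] * b[1] - a[1] * b[0]

def rotate(theta):
    """Clockwise rotation of the plane by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda x: (c * x[0] + s * x[1], -s * x[0] + c * x[1])

J = range(6)
f = lambda x: x ** 2 + 1
shift = lambda x: (x + 2) % 6       # bijective on J
collapse = lambda x: 0              # not bijective on J
scale = lambda x: (2 * x[0], 2 * x[1])

print(sums_agree(f, shift, J), sums_agree(f, collapse, J))
print(jacobian_det_2d(rotate(0.7), (0.3, -1.2)), jacobian_det_2d(scale, (0.3, -1.2)))
```

The scaling map is included as a contrast: it is bijective on the plane, but its Jacobian determinant differs from 1 in absolute value, so it is not unitary in the sense of Definition 3.1.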

We are now ready to define K-ary input symmetric channels. The symmetry properties of these channels are 
later exploited to improve the tightness of the sphere-packing bounding technique and derive the ISP lower bound 
on the average decoding error probability of block codes transmitted over these channels. 

Definition 3.2 (Symmetric Memoryless Channels): A memoryless channel with input alphabet $\mathcal{K} = \{0, 1, \ldots, K-1\}$,
output alphabet $\mathcal{J} \subseteq \mathbb{R}^d$ (where $K, d \in \mathbb{N}$) and transition probability (or density if $\mathcal{J}$ is non-countable) $P(\cdot|\cdot)$ is
said to be symmetric if there exists a set of unitary (bijective) mappings $\{g_k\}_{k=0}^{K-1}$ where $g_k : \mathcal{J} \to \mathcal{J}$ for all $k \in \mathcal{K}$
such that

$$\forall \, y \in \mathcal{J}, \; k \in \mathcal{K}: \qquad P(y|0) = P(g_k(y)|k) \tag{24}$$

and

$$\forall \, k_1, k_2 \in \mathcal{K}: \qquad g_{k_1}^{-1} \circ g_{k_2} = g_{(k_2 - k_1) \bmod K}. \tag{25}$$

Remark 3.2: From (24), the mapping $g_0$ is the identity mapping. Assigning $k_1 = k$ and $k_2 = 0$ in (25) gives

$$\forall \, k \in \mathcal{K}: \qquad g_k^{-1} = g_{(-k) \bmod K} = g_{K-k}. \tag{26}$$
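
A simple discrete example may help to fix ideas. The sketch below takes a hypothetical K-ary channel with modulo-additive noise, i.e., P(j|k) = w[(j - k) mod K] for an arbitrary noise distribution `w`, together with the cyclic shifts g_k(y) = (y + k) mod K. It checks condition (24) and the composition rule (25), read here as the inverse of g_{k1} followed by g_{k2} being equal to g with index (k2 - k1) mod K. The channel, the noise values, and the function names are illustrative assumptions.

```python
K = 4
w = [0.7, 0.1, 0.15, 0.05]  # hypothetical noise distribution over {0, ..., K-1}

def P(y, k):
    """Transition probability P(y|k) of the modulo-additive channel."""
    return w[(y - k) % K]

def g(k):
    """Cyclic shift by k; bijective on {0, ..., K-1}, hence unitary (countable case)."""
    return lambda y: (y + k) % K

def g_inv(k):
    """Inverse of the cyclic shift by k."""
    return lambda y: (y - k) % K

def satisfies_24():
    return all(P(y, 0) == P(g(k)(y), k) for y in range(K) for k in range(K))

def satisfies_25():
    return all(g_inv(k1)(g(k2)(y)) == g((k2 - k1) % K)(y)
               for k1 in range(K) for k2 in range(K) for y in range(K))

print(satisfies_24(), satisfies_25())
```

The binary symmetric channel is the special case K = 2 with w = (1 - p, p).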

The class of symmetric memoryless channels, as given in Definition 3.2, is quite large. In particular, it contains
the class of MBIOS channels. To show this, we employ the following proposition (see [21, Section 4.1.4]):

Proposition 3.1: An arbitrary MBIOS channel can be equivalently represented as a binary symmetric channel
(BSC) whose crossover probability is i.i.d., independent of the channel input, and observed by the receiver. The
crossover probability is given by $p = \frac{1}{1 + e^{|L|}}$ where L denotes the random variable of the log-likelihood ratio at the
channel output.

We now apply Proposition 3.1 to show that any MBIOS channel is a symmetric memoryless channel, according to
Definition 3.2.

Corollary 3.1: An arbitrary MBIOS channel can be equivalently represented as a symmetric memoryless channel.

Proof: Let us consider an MBIOS channel $\mathcal{C}$. Applying Proposition 3.1, $\mathcal{C}$ can be equivalently represented
by a channel $\mathcal{C}'$ whose output alphabet is $\mathcal{J} = \{0, 1\} \times [0, 1]$; here, the first term of the output refers to the BSC
output and the second term is the associated crossover probability. We now show that this equivalent channel is a
symmetric memoryless channel. To this end, it suffices to find a unitary mapping $g_1 : \mathcal{J} \to \mathcal{J}$ such that

$$\forall \, y \in \mathcal{J}: \qquad P(y|0) = P(g_1(y)|1) \tag{27}$$

and $g_1^{-1} = g_1$ (i.e., $g_1$ is equal to its inverse).

For the channel $\mathcal{C}'$, the conditional probability distribution (or density) function of the output $y = (m, p)$ (where
$m \in \{0, 1\}$ and $p \in [0, 1]$) given that $i \in \{0, 1\}$ is transmitted, is given by

$$P(y|i) = \begin{cases} \tilde{P}(p) \cdot (1 - p) & \text{if } i = m \\ \tilde{P}(p) \cdot p & \text{if } i = \bar{m} \end{cases} \tag{28}$$

where $\tilde{P}$ is a distribution (or density) over [0, 1] and $\bar{m}$ designates the logical not of m. From (28), we get that the
mapping $g_1(m, p) = (\bar{m}, p)$ satisfies (27). Additionally, $g_1^{-1} = g_1$ since $\bar{\bar{m}} = m$. Therefore, the proof is completed
by showing that $g_1$ is a unitary mapping. For any (generalized) function $f : \mathcal{J} \to \mathbb{R}$ we have

$$\int_{\mathcal{J}} f(x) \, dx = \sum_{m=0}^{1} \int_0^1 f(m, p) \, dp = \sum_{m=0}^{1} \int_0^1 f(\bar{m}, p) \, dp = \int_{\mathcal{J}} f(g_1(x)) \, dx$$

where the second equality holds by changing the order of summation; hence $g_1$ is a unitary function. ■

Remark 3.3: Proposition 3.1 forms a special case of a proposition given in [36, Appendix I]. Using the proposition
in [36, Appendix I], which refers to M-ary input channels, it can be shown in a similar way that all M-ary input
symmetric output channels, as defined in [36], can be equivalently represented as symmetric memoryless channels.
M-ary PSK modulated signals transmitted over the AWGN channel and coherently detected at the receiver form
another example of a symmetric memoryless channel. In this case, $\mathcal{J}$ is defined to be $\mathbb{R}^2$ and $g_k$ is a clockwise
rotation by $\frac{2\pi k}{M}$, where the determinant of the Jacobian is equal in absolute value to 1.
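
The PSK example can be checked numerically. The sketch below assumes a particular labeling of the constellation, with the k-th signal point located at exp(-2*pi*j*k/M) so that a clockwise rotation by 2*pi*k/M maps the point of symbol 0 to the point of symbol k, and an isotropic Gaussian noise with a hypothetical standard deviation `SIGMA` per dimension. Channel outputs are represented as complex numbers, and all names are illustrative.

```python
import cmath
import math

M = 8         # constellation size (assumed)
SIGMA = 0.6   # noise standard deviation per dimension (assumed)

def point(k):
    """Signal point of symbol k under the assumed clockwise labeling."""
    return cmath.exp(-2j * math.pi * k / M)

def density(y, k):
    """Density of the channel output y (a complex number) given symbol k."""
    return math.exp(-abs(y - point(k)) ** 2 / (2 * SIGMA ** 2)) / (2 * math.pi * SIGMA ** 2)

def g(k, y):
    """Clockwise rotation of y by the angle 2*pi*k/M."""
    return y * cmath.exp(-2j * math.pi * k / M)

def g_inv(k, y):
    """Inverse mapping: counterclockwise rotation by 2*pi*k/M."""
    return y * cmath.exp(2j * math.pi * k / M)

y = 0.3 + 0.4j
print([math.isclose(density(y, 0), density(g(k, y), k), rel_tol=1e-9) for k in range(M)])
```

With this labeling the rotations satisfy both the symmetry condition P(y|0) = P(g_k(y)|k) and the composition rule of Definition 3.2; a counterclockwise labeling would work equally well with the rotation direction reversed.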



B. Derivation of an Improved Sphere-Packing Bound for Symmetric Memoryless Channels 

In this section, we derive an improved sphere-packing lower bound on the decoding error probability of block 
codes transmitted over symmetric memoryless channels. To keep the notation simple, we derive the bound under 
the assumption that the communication takes place over a symmetric DMC. However, the derivation of the bound is 
justified later for the general class of symmetric memoryless channels with discrete or continuous output alphabets. 
Some remarks are given at the end of the derivation. 

Though there is a certain parallelism to the derivation of the SP67 bound in [25, Part 1], our analysis for symmetric 
memoryless channels deviates considerably from the derivation of this classical bound. The improvements suggested 
in [35] are also incorporated into the derivation of the bound. We show that for symmetric memoryless channels, the 
derivation of the sphere-packing bound can be modified so that the intermediate step of bounding the maximal error
probability for fixed composition codes can be skipped, and one can directly consider the average error probability
of an arbitrary block code. To this end, the first step of the derivation in [25] (see Theorem 2.1 here) is modified
so that instead of bounding the error probability when a single pair of probability assignments is considered, we
consider the average error probability over M pairs of probability assignments.

1) Average Decoding Error Probability for M Pairs of Probability Assignments: We start the analysis by
considering the average decoding error probability over M pairs of probability assignments, denoted $\{P_1^m, P_2^m\}_{m=1}^M$,
where it is assumed that the index $m$ of the pair is chosen uniformly at random from the set $\{1, \ldots, M\}$ and is
known to the decoder. Denote the observation by $y$ and the observation space by $\mathcal{Y}$. For simplicity, we assume that
$\mathcal{Y}$ is a finite set. Following the notation in [25], we define the LLR for the $m$th pair of probability assignments as

\[ D^m(y) \triangleq \ln\left( \frac{P_2^m(y)}{P_1^m(y)} \right) \tag{29} \]

and the probability distribution

\[ Q_s^m(y) \triangleq \frac{P_1^m(y)^{1-s}\, P_2^m(y)^{s}}{\sum_{y' \in \mathcal{Y}} P_1^m(y')^{1-s}\, P_2^m(y')^{s}}, \quad 0 < s < 1. \tag{30} \]

For the $m$th pair, we also define the function $\mu^m$ as

\[ \mu^m(s) \triangleq \ln\left( \sum_{y' \in \mathcal{Y}} P_1^m(y')^{1-s}\, P_2^m(y')^{s} \right), \quad 0 < s < 1. \tag{31} \]

Let us assume that $\mu^m$ and its first and second derivatives w.r.t. $s$ are independent of the value of $m$, and therefore
we can define $\mu \triangleq \mu^1 = \mu^2 = \ldots = \mu^M$.

Remark 3.4: Note that in this setting, the requirement that $\mu^m$ is independent of $m$ inherently yields that all
its derivatives are also independent of $m$. However, in the continuation, we will let $P_2^m$ be a function of $s$ and
differentiate $\mu^m$ w.r.t. $s$ while holding $P_2^m$ fixed. In this setting, we will show that for the specific selection of $P_1^m$
and $P_2^m$ which are used to derive the new lower bound on the average block error probability, if the communication
takes place over a symmetric memoryless channel then $\mu^m$ and its first two derivatives w.r.t. $s$ are independent of
$m$. Also note that the fact that $\mu^m$ is independent of $m$ does not imply that $Q_s^m$ is independent of $m$.
Based on the assumption above, it can be easily verified (in parallel to (9)-(12)) that for all $m \in \{1, \ldots, M\}$

\[ \mu'(s) = (\mu^m)'(s) = \mathbb{E}_{Q_s^m}\left( D^m(y) \right) \tag{32} \]

\[ \mu''(s) = (\mu^m)''(s) = \mathrm{Var}_{Q_s^m}\left( D^m(y) \right) \tag{33} \]

\[ P_1^m(y) = \exp\left( \mu(s) - s\, D^m(y) \right) Q_s^m(y) \tag{34} \]

\[ P_2^m(y) = \exp\left( \mu(s) + (1 - s)\, D^m(y) \right) Q_s^m(y) \tag{35} \]

where $\mathbb{E}_Q$ and $\mathrm{Var}_Q$ stand, respectively, for the statistical expectation and variance w.r.t. a probability distribution
$Q$. For the $m$th pair, we define the set of typical output vectors as

\[ \mathcal{Y}_s^{m,x} \triangleq \left\{ y \in \mathcal{Y} : \left| D^m(y) - \mu'(s) \right| \le x \sqrt{2 \mu''(s)} \right\}, \quad x > 0. \tag{36} \]

In the original derivation of the SP67 bound in [25] (see (13) here), the parameter $x$ was set to one; similarly to
[35], this parameter is introduced in (36) in order to tighten the bound for finite-length block codes. However,
in both [25] and [35], only one pair of probability assignments was considered. By applying Chebyshev's inequality
to (36), and relying on the equalities in (32) and (33), we get that for all $m \in \{1, \ldots, M\}$

\[ \sum_{y \in \mathcal{Y}_s^{m,x}} Q_s^m(y) > 1 - \frac{1}{2x^2} \tag{37} \]

where this result is meaningful only for $x > \frac{\sqrt{2}}{2}$.
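The identities (32)-(35) and the Chebyshev bound (37) can be checked numerically for a single pair of probability assignments. The following is a minimal sketch: the two distributions `P1` and `P2` and the values of `s` and `x` are hypothetical, chosen only for illustration, and the derivatives of $\mu$ are approximated by central finite differences.

```python
import math

def tilted(P1, P2, s):
    """Return (mu(s), Q_s) of (31) and (30) for two distributions on one alphabet."""
    w = [p1 ** (1 - s) * p2 ** s for p1, p2 in zip(P1, P2)]
    z = sum(w)
    return math.log(z), [wi / z for wi in w]

# Hypothetical pair of probability assignments (strictly positive entries).
P1 = [0.5, 0.3, 0.15, 0.05]
P2 = [0.1, 0.2, 0.3, 0.4]
s, x = 0.4, 1.5

mu, Q = tilted(P1, P2, s)
D = [math.log(p2 / p1) for p1, p2 in zip(P1, P2)]        # LLR of (29)
mean_D = sum(q * d for q, d in zip(Q, D))                # E_Q[D], cf. (32)
var_D = sum(q * (d - mean_D) ** 2 for q, d in zip(Q, D)) # Var_Q[D], cf. (33)

# Central finite differences of mu(s) approximate mu'(s) and mu''(s).
h = 1e-4
mu_plus, _ = tilted(P1, P2, s + h)
mu_minus, _ = tilted(P1, P2, s - h)
d1 = (mu_plus - mu_minus) / (2 * h)
d2 = (mu_plus - 2 * mu + mu_minus) / h ** 2

# Mass of the typical set (36) under Q_s, to be compared with (37).
typical_mass = sum(q for q, d in zip(Q, D)
                   if abs(d - mean_D) <= x * math.sqrt(2 * var_D))
```

The finite-difference step `h` trades truncation error against rounding error; it is an illustrative choice rather than a tuned one.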

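Let $\mathcal{Y}_1^m$ and $\mathcal{Y}_2^m$ be the decoding regions of $P_1^m$ and $P_2^m$, respectively. Since the index $m$ is known to the
decoder, $P_1^m$ is decoded only against $P_2^m$; hence, $\mathcal{Y}_1^m$ and $\mathcal{Y}_2^m$ form a partition of the observation space $\mathcal{Y}$. We now
derive a lower bound on the conditional error probability given that the correct hypothesis is the first probability
assignment and the $m$th pair was selected. Similarly to (14), we get the following lower bound from (34) and (36):

\[ P_{e,1}^m \ge \exp\left( \mu(s) - s \mu'(s) - s\, x \sqrt{2 \mu''(s)} \right) \sum_{y \in \mathcal{Y}_2^m \cap \mathcal{Y}_s^{m,x}} Q_s^m(y). \tag{38} \]

Following the same steps w.r.t. the conditional error probability of $P_2^m$ and applying (35) gives

\[ P_{e,2}^m \ge \exp\left( \mu(s) + (1 - s) \mu'(s) - (1 - s)\, x \sqrt{2 \mu''(s)} \right) \sum_{y \in \mathcal{Y}_1^m \cap \mathcal{Y}_s^{m,x}} Q_s^m(y). \tag{39} \]

Averaging (38) and (39) over $m$ gives that for all $s \in (0, 1)$

\[ P_{e,1}^{\mathrm{avg}} \triangleq \frac{1}{M} \sum_{m=1}^{M} P_{e,1}^m \ge \exp\left( \mu(s) - s \mu'(s) - s\, x \sqrt{2 \mu''(s)} \right) \frac{1}{M} \sum_{m=1}^{M} \sum_{y \in \mathcal{Y}_2^m \cap \mathcal{Y}_s^{m,x}} Q_s^m(y) \tag{40} \]

and

\[ P_{e,2}^{\mathrm{avg}} \triangleq \frac{1}{M} \sum_{m=1}^{M} P_{e,2}^m \ge \exp\left( \mu(s) + (1 - s) \mu'(s) - (1 - s)\, x \sqrt{2 \mu''(s)} \right) \frac{1}{M} \sum_{m=1}^{M} \sum_{y \in \mathcal{Y}_1^m \cap \mathcal{Y}_s^{m,x}} Q_s^m(y) \tag{41} \]

where $P_{e,1}^{\mathrm{avg}}$ and $P_{e,2}^{\mathrm{avg}}$ refer to the average error probabilities given that the first or second hypotheses, respectively,
of a given pair are correct, where this pair is chosen uniformly at random among the M possible pairs of hypotheses.
Since for all $m$, the sets $\mathcal{Y}_1^m$ and $\mathcal{Y}_2^m$ form a partition of the set of output vectors $\mathcal{Y}$, then

\[ \frac{1}{M} \sum_{m=1}^{M} \sum_{y \in \mathcal{Y}_1^m \cap \mathcal{Y}_s^{m,x}} Q_s^m(y) + \frac{1}{M} \sum_{m=1}^{M} \sum_{y \in \mathcal{Y}_2^m \cap \mathcal{Y}_s^{m,x}} Q_s^m(y) = \frac{1}{M} \sum_{m=1}^{M} \sum_{y \in \mathcal{Y}_s^{m,x}} Q_s^m(y) > 1 - \frac{1}{2x^2} \]

where the last transition follows from (37) and is meaningful for $x > \frac{\sqrt{2}}{2}$. Hence, at least one of the terms in the
LHS of the above equality is necessarily greater than $\frac{1}{2}\left(1 - \frac{1}{2x^2}\right)$. Combining this result with (40) and (41), we
get that for every $s \in (0, 1)$

\[ P_{e,1}^{\mathrm{avg}} > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left( \mu(s) - s \mu'(s) - s\, x \sqrt{2 \mu''(s)} \right) \tag{42} \]

or

\[ P_{e,2}^{\mathrm{avg}} > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left( \mu(s) + (1 - s) \mu'(s) - (1 - s)\, x \sqrt{2 \mu''(s)} \right). \tag{43} \]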

The two inequalities above provide a lower bound on the average decoding error probability over M pairs of 
probability assignments. We now turn to consider a block code which is transmitted over a symmetric DMC. 
Similarly to the derivation of the SP67 bound in [25], we use the lower bound derived in this section to relate 
the decoding error probability when a given codeword is transmitted to the size of the decision region associated 
with this codeword. However, the bound above allows us to directly consider the average block error probability; 
this is in contrast to the derivation in [25] which first considered the maximal block error probability of the code 
and then used an argument based on expurgating half of the bad codewords in order to obtain a lower bound on 
the average error probability of the expurgated code (where the code rate is asymptotically not affected as a result 
of this expurgation). Additionally, we show that when the transmission takes place over a memoryless symmetric 
channel, one can consider directly an arbitrary block code instead of starting the analysis by referring to fixed 
composition codes as in [25, Part 1] and [35]. 

2) Lower Bound on the Decoding Error Probability of General Block Codes: We now consider a block code $\mathcal{C}$
of length $N$ with $M$ codewords, denoted by $\{\mathbf{x}_m\}_{m=1}^M$; assume that the transmission takes place over a symmetric
DMC with transition probabilities $P(j|k)$, where $k \in \mathcal{K} = \{0, \ldots, K - 1\}$ and $j \in \mathcal{J} = \{0, \ldots, J - 1\}$ designate
the channel input and output alphabets, respectively. In this section, we derive a lower bound on the average block
error probability of the code $\mathcal{C}$ under an arbitrary list decoder where the size of the list is limited to $L$. Let $f_N$ be a
probability measure defined over the set of length-$N$ sequences of the channel output, and which can be factorized
as in (16). We define $M$ pairs of probability measures $\{P_1^m, P_2^m\}$ by

\[ P_1^m(\mathbf{y}) \triangleq \Pr(\mathbf{y}|\mathbf{x}_m), \quad P_2^m(\mathbf{y}) \triangleq f_N(\mathbf{y}), \quad m \in \{1, 2, \ldots, M\} \tag{44} \]

where $\mathbf{x}_m$ is the $m$th codeword of the code $\mathcal{C}$. Combining (31) and (44), the function $\mu^m$ takes the form

\[ \mu^m(s) = \ln\left( \sum_{\mathbf{y}} \Pr(\mathbf{y}|\mathbf{x}_m)^{1-s}\, f_N(\mathbf{y})^{s} \right), \quad 0 < s < 1. \tag{45} \]

Let us denote by $q_k^m$ the fraction of appearances of the letter $k$ in the codeword $\mathbf{x}_m$. By assumption, the communication
channel is memoryless and the function $f_N$ is a probability measure which is factorized according to (16).
Hence, for every $m \in \{1, 2, \ldots, M\}$, the function $\mu^m(s)$ in (45) is expressible in the form

\[ \mu^m(s) = N \sum_{k=0}^{K-1} q_k^m\, \mu_k(s) \tag{46} \]

where

\[ \mu_k(s) \triangleq \ln\left( \sum_{j=0}^{J-1} P(j|k)^{1-s}\, f(j)^{s} \right), \quad 0 < s < 1. \tag{47} \]

In order to apply the bound in (42) and (43), it is required that the function $\mu^m$ and its first and second derivatives
w.r.t. $s$ are independent of the index $m$. From (46), it suffices to show that $\mu_k$ and its first and second derivatives
are independent of the input symbol $k$. To this end, for every $s \in (0, 1)$, we choose the function $f$ to be $f_s$, as
given in [25, Eqs. (4.18)-(4.20)]. Namely, for $0 < s < 1$, let $\mathbf{q}_s = \{q_{0,s}, \ldots, q_{K-1,s}\}$ satisfy the inequalities

\[ \sum_{j=0}^{J-1} P(j|k)^{1-s}\, \alpha_{j,s}^{\frac{s}{1-s}} \ge \sum_{j=0}^{J-1} \alpha_{j,s}^{\frac{1}{1-s}}, \quad \forall\, k \tag{48} \]

where

\[ \alpha_{j,s} \triangleq \sum_{k'=0}^{K-1} q_{k',s}\, P(j|k')^{1-s}. \tag{49} \]

The function $f = f_s$ is given by

\[ f_s(j) = \frac{\alpha_{j,s}^{\frac{1}{1-s}}}{\sum_{j'=0}^{J-1} \alpha_{j',s}^{\frac{1}{1-s}}}, \quad j \in \{0, \ldots, J - 1\}. \tag{50} \]

Note that the input distribution $\mathbf{q}_s$ is independent of the code $\mathcal{C}$, as it only depends on the channel statistics. It
should also be noted that since the bound in (42) and (43) holds for every $s \in (0, 1)$, $P_1^m$ and $P_2^m$ are in general
allowed to depend on $s$. However, the differentiation of the function $\mu^m$ w.r.t. $s$ is performed while holding $P_1^m$
and $P_2^m$ fixed. The following lemma shows that for symmetric channels, the function $f_s$ in (50) yields that $\mu_k$ and
its first and second derivatives w.r.t. $s$ (while holding $f_s$ fixed) are independent of the input symbol $k$.

Lemma 3.1: Let $P(\cdot|\cdot)$ designate the transition probability function of a symmetric DMC with input alphabet
$\mathcal{K} = \{0, \ldots, K - 1\}$ and output alphabet $\mathcal{J} = \{0, \ldots, J - 1\}$, and let $\mu_k$ be defined as in (47), where $f = f_s$ is
given in (50). Then, the following properties hold for all $s \in (0, 1)$

\[ \mu_0(s, f_s) = \mu_1(s, f_s) = \ldots = \mu_{K-1}(s, f_s) = -(1 - s)\, E_0\!\left( \frac{s}{1-s} \right) \tag{51} \]

\[ \mu_0'(s, f_s) = \mu_1'(s, f_s) = \ldots = \mu_{K-1}'(s, f_s) \tag{52} \]

\[ \mu_0''(s, f_s) = \mu_1''(s, f_s) = \ldots = \mu_{K-1}''(s, f_s) \tag{53} \]

where $E_0$ is introduced in (19) and the differentiation in (52) and (53) is performed w.r.t. $s$ while holding $f_s$ fixed.
Proof: The proof of this lemma is quite technical and is given in Appendix A. ■

Remark 3.5: Since the differentiation of the function $\mu_k$ w.r.t. $s$ is performed while holding $f = f_s$ fixed,
the independence of the function $\mu_k$ of the parameter $k$, as stated in (51), does not necessarily imply the
independence of the first and second derivatives of $\mu_k$ as in (52) and (53); in order to prove this lemma, we rely on
the symmetry of the memoryless channel. The function $\mu_0$ in (47) and its derivatives are calculated in Appendix B
for some symmetric memoryless channels, and these results are later used for the numerical calculations of the
sphere-packing bounds in Section V.

By (46) and Lemma 3.1, we get that the function $\mu^m$ and its first and second derivatives w.r.t. $s$ are independent
of the index $m$ (where this property also follows since $\sum_{k=0}^{K-1} q_k^m = 1$, irrespective of $m$). Hence, the lower bound
derived in Section III-B.1 can be applied.
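The statement of Lemma 3.1 can be illustrated numerically on a small example. The sketch below uses a hypothetical binary-input, four-output symmetric DMC (the row for input 1 is the row for input 0 reversed) and assumes that the uniform input distribution plays the role of $\mathbf{q}_s$; the derivatives w.r.t. $s$ are taken by finite differences while holding $f_s$ fixed, as the lemma requires.

```python
import math

# Hypothetical symmetric DMC: the symmetry mapping is j -> 3 - j.
P = [[0.5, 0.3, 0.15, 0.05],
     [0.05, 0.15, 0.3, 0.5]]
K, J = len(P), len(P[0])
q = [1.0 / K] * K          # assumption: uniform input distribution used as q_s

def alpha(s):
    """alpha_{j,s} of (49)."""
    return [sum(q[k] * P[k][j] ** (1 - s) for k in range(K)) for j in range(J)]

def f_s(s):
    """Output distribution f_s of (50)."""
    a = [aj ** (1 / (1 - s)) for aj in alpha(s)]
    total = sum(a)
    return [aj / total for aj in a]

def mu_k(k, s, f):
    """mu_k(s) of (47) for a given (fixed) output distribution f."""
    return math.log(sum(P[k][j] ** (1 - s) * f[j] ** s for j in range(J)))

def E0(rho):
    """Gallager's function for the input distribution q."""
    return -math.log(sum(
        sum(q[k] * P[k][j] ** (1 / (1 + rho)) for k in range(K)) ** (1 + rho)
        for j in range(J)))

def derivs(k, s, f, h=1e-4):
    """Finite-difference first and second derivatives of mu_k with f held fixed."""
    lo, mid, hi = mu_k(k, s - h, f), mu_k(k, s, f), mu_k(k, s + h, f)
    return (hi - lo) / (2 * h), (hi - 2 * mid + lo) / h ** 2

s = 0.3
f = f_s(s)
mu = [mu_k(k, s, f) for k in range(K)]
d = [derivs(k, s, f) for k in range(K)]
target = -(1 - s) * E0(s / (1 - s))     # right-hand side of (51)
```

For this channel, `mu[0]`, `mu[1]` and `target` agree up to rounding, and so do the two pairs of derivatives in `d`.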





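Let $\mathcal{Y}_m$ be the decision region of the codeword $\mathbf{x}_m$. By associating $\mathcal{Y}_m$ and $\mathcal{Y}_m^{\mathrm{c}}$ with the two decision regions
for the probability measures $P_1^m$ and $P_2^m$, respectively, we get

\[ P_{e,1}^m = \sum_{\mathbf{y} \in \mathcal{Y}_m^{\mathrm{c}}} P_1^m(\mathbf{y}) = \sum_{\mathbf{y} \in \mathcal{Y}_m^{\mathrm{c}}} \Pr(\mathbf{y}|\mathbf{x}_m) = P_{e,m} \]

and

\[ P_{e,2}^m = \sum_{\mathbf{y} \in \mathcal{Y}_m} P_2^m(\mathbf{y}) = \sum_{\mathbf{y} \in \mathcal{Y}_m} f_N(\mathbf{y}) = F(\mathcal{Y}_m) \]

where $P_{e,m}$ is the decoding error probability of the code $\mathcal{C}$ when the codeword $\mathbf{x}_m$ is transmitted, and $F(\mathcal{Y}_m)$ is
a measure for the size of the decoding region $\mathcal{Y}_m$ as defined in (11). Substituting the two equalities above in (42)
and (43) gives that for all $s \in (0, 1)$

\[ \frac{1}{M} \sum_{m=1}^{M} P_{e,m} = P_{e,1}^{\mathrm{avg}} > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left( \mu(s) - s \mu'(s) - s\, x \sqrt{2 \mu''(s)} \right) \tag{54} \]

or

\[ \frac{1}{M} \sum_{m=1}^{M} F_s(\mathcal{Y}_m) = P_{e,2}^{\mathrm{avg}} > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left( \mu(s) + (1 - s) \mu'(s) - (1 - s)\, x \sqrt{2 \mu''(s)} \right) \tag{55} \]

where $x > \frac{\sqrt{2}}{2}$ and $F_s(\mathcal{Y}_m) \triangleq \sum_{\mathbf{y} \in \mathcal{Y}_m} f_{N,s}(\mathbf{y})$. Similarly to [25], we relate $\sum_{m=1}^{M} F_s(\mathcal{Y}_m)$ to the number of codewords
$M$ and to the size of the decoding list, which is limited to $L$. First, for all $0 < s < 1$

\[ \sum_{m=1}^{M} F_s(\mathcal{Y}_m) = \sum_{m=1}^{M} \sum_{\mathbf{y} \in \mathcal{Y}_m} f_{N,s}(\mathbf{y}) \le L \]

where the last inequality holds since each $\mathbf{y} \in \mathcal{J}^N$ is included in at most $L$ subsets $\{\mathcal{Y}_m\}_{m=1}^M$ and also $\sum_{\mathbf{y}} f_{N,s}(\mathbf{y}) = 1$.
Hence, the LHS of (55) is upper bounded by $\frac{L}{M}$ for all $0 < s < 1$. Additionally, the LHS of (54) is equal by
definition to the average block error probability $P_e$ of the code $\mathcal{C}$. Therefore, (54) and (55) can be rewritten as

\[ P_e > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left( \mu(s) - s \mu'(s) - s\, x \sqrt{2 \mu''(s)} \right) \tag{56} \]

or

\[ \frac{L}{M} > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left( \mu(s) + (1 - s) \mu'(s) - (1 - s)\, x \sqrt{2 \mu''(s)} \right). \tag{57} \]

Applying (46) and Lemma 3.1 to (56) and (57) gives that for all $s \in (0, 1)$

\[ P_e > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left\{ N \left( \mu_0(s, f_s) - s\, \mu_0'(s, f_s) - s\, x \sqrt{\frac{2 \mu_0''(s, f_s)}{N}} \right) \right\} \tag{58} \]

or

\[ \frac{L}{M} > \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left\{ N \left( \mu_0(s, f_s) + (1 - s)\, \mu_0'(s, f_s) - (1 - s)\, x \sqrt{\frac{2 \mu_0''(s, f_s)}{N}} \right) \right\}. \tag{59} \]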

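A lower bound on the average block error probability can be obtained from (58) by substituting any value of
$s \in (0, 1)$ for which the inequality in (59) does not hold. In particular, we choose a value $s = s_x$ such that the
inequality in (59) is replaced by equality, i.e.,

\[ \frac{L}{M} = \exp(-NR) = \left( \frac{1}{2} - \frac{1}{4x^2} \right) \exp\left\{ N \left( \mu_0(s_x, f_{s_x}) + (1 - s_x)\, \mu_0'(s_x, f_{s_x}) - (1 - s_x)\, x \sqrt{\frac{2 \mu_0''(s_x, f_{s_x})}{N}} \right) \right\} \tag{60} \]

where $R \triangleq \frac{\ln\left(\frac{M}{L}\right)}{N}$ designates the code rate in nats per channel use. Note that the existence of a solution $s = s_x$
to (60) can be demonstrated in a similar way to the arguments in [25, Eqs. (4.28)-(4.35)] for the non-trivial case
where the sphere-packing bound does not reduce to the trivial inequality $P_e \ge 0$. This particular value of $s$ is
chosen since for a large enough value of $N$, the RHS of (58) is monotonically decreasing while the RHS of (59)
is monotonically increasing for $s \in (0, 1)$; thus, this choice is optimal for large enough $N$. The choice of $s = s_x$
also yields a simpler representation of the bound on the average block error probability. Rearranging (60)
gives

\[ \mu_0'(s_x, f_{s_x}) = -\frac{1}{1 - s_x} \left[ R + \mu_0(s_x, f_{s_x}) + \frac{1}{N} \ln\left( \frac{1}{2} - \frac{1}{4x^2} \right) \right] + x \sqrt{\frac{2 \mu_0''(s_x, f_{s_x})}{N}}. \]

Substituting $s = s_x$ and the last equality into (58) yields that

\[ P_e > \exp\left\{ N \left( \frac{\mu_0(s_x, f_{s_x})}{1 - s_x} + \frac{s_x}{1 - s_x} \left( R + \frac{1}{N} \ln\left( \frac{1}{2} - \frac{1}{4x^2} \right) \right) - s_x\, x \sqrt{\frac{8 \mu_0''(s_x, f_{s_x})}{N}} + \frac{1}{N} \ln\left( \frac{1}{2} - \frac{1}{4x^2} \right) \right) \right\}. \]

By applying (51) and defining $\rho_x \triangleq \frac{s_x}{1 - s_x}$ we get

\[ P_e > \exp\left\{ -N \left( E_0(\rho_x) - \rho_x \left( R - \frac{\ln 4}{N} + \frac{\ln\left(2 - \frac{1}{x^2}\right)}{N} \right) + s_x\, x \sqrt{\frac{8 \mu_0''(s_x, f_{s_x})}{N}} + \frac{\ln 4}{N} - \frac{\ln\left(2 - \frac{1}{x^2}\right)}{N} \right) \right\}. \]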

Note that the above lower bound on the average decoding error probability holds for an arbitrary block code of
length $N$ and rate $R$. The selection of $\rho_x$ is similar to [35]. Finally, we optimize over the parameter $x \in \left( \frac{\sqrt{2}}{2}, \infty \right)$
in order to get the tightest lower bound of this form.

The derivation above only relies on the fact that the channel is memoryless and symmetric, but does not rely
on the fact that the output alphabet is discrete. As mentioned in Section II-B, the original derivation of the SP67
bound in [25] relies on the fact that the input and output alphabets are finite in order to upper bound $\mu''(s)$ by
$\left( \ln \frac{e}{\sqrt{P_{\min}}} \right)^2$, where $P_{\min}$ designates the smallest non-zero transition probability of the channel. This requirement
was relaxed in [35] to the requirement that only the input alphabet is finite; to this end, the second derivative of the
function $\mu$ is calculated, thus the above upper bound on this second derivative is replaced by its exact value. The
validity of the derivation for symmetric continuous-output channels is provided in the continuation (see Remark 3.9).
This leads to the following theorem, which provides an improved sphere-packing lower bound on the decoding
error probability of block codes transmitted over symmetric memoryless channels.

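Theorem 3.1 (An Improved Sphere-Packing (ISP) Bound for Symmetric Memoryless Channels): Let $\mathcal{C}$ be an arbitrary
block code consisting of $M$ codewords, each of length $N$. Assume that $\mathcal{C}$ is transmitted over a memoryless
symmetric channel which is specified by the transition probabilities (or densities) $P(j|k)$, where $k \in \mathcal{K} =
\{0, \ldots, K - 1\}$ and $j \in \mathcal{J} \subseteq \mathbb{R}^d$ designate the channel input and output alphabets, respectively. Assume an
arbitrary list decoder where the size of the list is limited to $L$. The average decoding error probability satisfies

\[ P_e(N, M, L) > \exp\left\{ -N E_{\mathrm{ISP}}(R, N) \right\} \]

where

\[ E_{\mathrm{ISP}}(R, N) \triangleq \inf_{x > \frac{\sqrt{2}}{2}} \left\{ E_0(\rho_x) - \rho_x \left( R - O_1\!\left( \frac{1}{N}, x \right) \right) + O_2\!\left( \frac{1}{\sqrt{N}}, x, \rho_x \right) \right\} \tag{61} \]

the function $E_0$ is introduced in (19), $R \triangleq \frac{1}{N} \ln\left( \frac{M}{L} \right)$, and

\[ O_1\!\left( \frac{1}{N}, x \right) \triangleq \frac{\ln 4}{N} - \frac{\ln\left( 2 - \frac{1}{x^2} \right)}{N} \tag{62} \]

\[ O_2\!\left( \frac{1}{\sqrt{N}}, x, \rho \right) \triangleq s(\rho)\, x \sqrt{\frac{8\, \mu_0''\left( s(\rho), f_{s(\rho)} \right)}{N}} + \frac{\ln 4}{N} - \frac{\ln\left( 2 - \frac{1}{x^2} \right)}{N}. \tag{63} \]

Here, $s(\rho) \triangleq \frac{\rho}{1 + \rho}$, the non-negative parameter $\rho = \rho_x$ in the RHS of (61) is determined by solving the equation

\[ R - O_1\!\left( \frac{1}{N}, x \right) = -\mu_0\left( s(\rho), f_{s(\rho)} \right) - (1 - s(\rho))\, \mu_0'\left( s(\rho), f_{s(\rho)} \right) + (1 - s(\rho))\, x \sqrt{\frac{2\, \mu_0''\left( s(\rho), f_{s(\rho)} \right)}{N}} \tag{64} \]

and the functions $\mu_0(s, f)$ and $f_s$ are defined in (47) and (50), respectively.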
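As an illustration of how Theorem 3.1 can be evaluated, the following sketch computes the bound for a binary symmetric channel with crossover probability `p`. It is a minimal sketch under stated assumptions: $f_s$ is taken to be uniform (which holds for the BSC by symmetry, so holding $f_s$ fixed does not affect the derivatives of $\mu_0$), equation (64) is solved for $s$ by bisection assuming a single sign change on $(0, 1)$, and the optimization over $x$ is a coarse grid search. The function names and the grid are hypothetical and not taken from the paper.

```python
import math

def isp_exponent_bsc(p, N, R, x):
    """Exponent in braces in (61) for a BSC at a fixed x > sqrt(2)/2.

    Returns (exponent, s, residual of (64)), or None if (64) has no sign
    change on (0, 1). R is in nats per channel use.
    """
    lp, lq = math.log(p), math.log(1 - p)

    def mu_terms(s):
        # mu_0(s) = ln(((1-p)^(1-s) + p^(1-s)) / 2^s) with its two derivatives,
        # obtained as the mean and variance of D = ln(f / P) under Q_s.
        a, b = math.exp((1 - s) * lq), math.exp((1 - s) * lp)
        z = a + b
        w0, w1 = a / z, b / z
        d0, d1 = -math.log(2) - lq, -math.log(2) - lp
        m1 = w0 * d0 + w1 * d1
        m2 = w0 * (d0 - m1) ** 2 + w1 * (d1 - m1) ** 2
        return math.log(z) - s * math.log(2), m1, m2

    o1 = (math.log(4) - math.log(2 - 1 / x ** 2)) / N           # (62)

    def residual(s):                                             # (64)
        m0, m1, m2 = mu_terms(s)
        rhs = -m0 - (1 - s) * m1 + (1 - s) * x * math.sqrt(2 * m2 / N)
        return rhs - (R - o1)

    lo, hi = 1e-9, 1 - 1e-9
    if residual(lo) <= 0 or residual(hi) >= 0:
        return None
    for _ in range(200):                  # bisection on the sign change
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    m0, m1, m2 = mu_terms(s)
    rho = s / (1 - s)
    e0 = -m0 / (1 - s)                                           # via (51)
    o2 = s * x * math.sqrt(8 * m2 / N) + o1                      # (63)
    return e0 - rho * (R - o1) + o2, s, residual(s)

def isp_bound_bsc(p, N, R, xs=None):
    """Lower bound exp(-N * min_x exponent) over a coarse grid of x values."""
    xs = xs or [0.75 + 0.05 * i for i in range(60)]
    results = (isp_exponent_bsc(p, N, R, x) for x in xs)
    exps = [r[0] for r in results if r is not None]
    return math.exp(-N * min(exps)) if exps else 0.0
```

For example, `isp_bound_bsc(0.05, 500, 0.3)` evaluates the bound for a length-500 code of rate 0.3 nats per channel use with $L = 1$; a finer grid or a one-dimensional optimizer over $x$ would tighten the result slightly.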

Remark 3.6: The requirement that the communication channel is symmetric is crucial to the derivation of the
ISP bound. One of the new concepts introduced here is the use of the channel symmetry to show that the function
$\mu^m$ and its first and second derivatives w.r.t. $s$ are independent of the codeword composition. This makes it possible to tighten
the VF bound in [35] by skipping the intermediate step which is related to fixed composition codes. Another new
concept is a direct consideration of the average decoding error probability of the code rather than considering the
maximal block error probability and expurgating the code. This is due to the consideration of $M$ pairs of probability
distributions in the first step of the derivation. Note that the bound on the average block error probability of $M$
probability assignment pairs requires that $\mu^m$ and its first and second derivatives are independent of the index $m$;
this property holds due to the symmetry of the memoryless communication channel.

Remark 3.7: In light of the previous remark, where we do not need to consider the block error probability of
fixed composition codes as an intermediate step, the ISP bound differs from the VF bound [35] (see Theorem 2.4)
in the sense that the term $\frac{\log \binom{N+K-1}{K-1}}{N}$ is removed from $O_1\!\left( \frac{\ln N}{N}, x \right)$ (see (22)). Therefore, the shift in the rate of
the error exponent of the ISP bound behaves asymptotically like $O_1\!\left( \frac{1}{N} \right)$ instead of $O_1\!\left( \frac{\ln N}{N} \right)$ (see (21), (22) and
(62)). Additionally, the derivation of the VF bound requires expurgation of the code to transform a lower bound
on the maximal block error probability to a lower bound on the average block error probability. These differences
indicate a tightening of the pre-exponent of the ISP bound (as compared to the SP67 and VF bounds) which is
expected to be especially pronounced for codes of small to moderate block lengths and also when the size of the
channel input alphabet is large (as will be verified in Section V).

Remark 3.8: The rate loss as a result of the expurgation of the code by removing half of the codewords with the
largest error probability was ignored in [35]. The term $\frac{\ln 4}{N}$, as it appears in the term $O_1\!\left( \frac{\ln N}{N}, x \right)$ of [35, Theorem 7],
should therefore be replaced by $\frac{\ln 8}{N}$ (see (62)).

Remark 3.9: The ISP bound is also applicable to symmetric channels with continuous output. When the ISP
bound is applied to a memoryless symmetric channel with a continuous-output alphabet, the transition probability
is replaced by a transition density function and the sums over the output alphabet are replaced by integrals.
Note that these densities may include Dirac delta functions which appear at the points where the corresponding
input distribution or the transition density function of the channel are discontinuous. Additionally, as explained in
Appendix A, the statement in Lemma 3.1 holds for general symmetric memoryless channels.

IV. The 1959 Sphere-Packing Bound of Shannon and Improved Algorithms for Its Calculation 

The 1959 sphere-packing (SP59) bound derived by Shannon [24] provides a lower bound on the decoding error
probability of an arbitrary block code whose transmission takes place over an AWGN channel. We begin this section
by introducing the SP59 bound in its original form, along with asymptotic approximations derived by Shannon
[24] which facilitate the estimation of the bound for large block lengths. We then review a theorem, introduced by
Valembois and Fossorier [35], presenting a set of recursive equations which simplify the calculation of this bound.
Both the original formula for the SP59 bound in [24] and the recursive method in [35] perform the calculations
in the probability domain; this leads to numerical overflows and underflows when calculating
the exact value of the bound for codes of block lengths of N = 1000 or more. In this section, we present an
alternative approach which facilitates the calculation of the SP59 bound in the logarithmic domain. This circumvents
these overflows and underflows in the calculation of the SP59 bound, regardless of the block length.

A. The 1959 Sphere-Packing Bound and Asymptotic Approximations

Consider a block code $\mathcal{C}$ of length $N$ and rate $R$ nats per channel use per dimension. It is assumed that all
the codewords are mapped to signals with equal energy (e.g., PSK modulation); hence, all the signals representing
codewords lie on an $N$-dimensional sphere centered at the origin, but finer details of the modulation used are
not taken into account in the derivation of the bound. This assumption implies that every Voronoi cell (i.e., the
convex region containing all the points which are closer to the considered signal than to any other code signal) is
a polyhedric cone which is limited by at most $\exp(NR) - 1$ hyperplanes intersecting at the origin. As a measure
of volume, Shannon introduced the solid angle of a cone, which is defined to be the area of the sphere of unit
radius cut out by the cone. Since the Voronoi cells partition the space $\mathbb{R}^N$, the sum of their solid angles is
equal to the area of an $N$-dimensional sphere of unit radius. The derivation of the SP59 bound relies on two main
observations:

• Among the cones of a given solid angle, the lowest probability of error is obtained by the circular cone whose
axis connects the code signal with the origin.

• In order to minimize the average decoding error probability, it is best to share the total solid angle equally
among the $\exp(NR)$ Voronoi regions.

As a corollary of these two observations, it follows that the average block error probability cannot be below the one
which corresponds to the case where all the Voronoi regions are circular cones centered around the code signals
with a common solid angle which is equal to a fraction of $\exp(-NR)$ of the solid angle of $\mathbb{R}^N$. The solid angle
of a circular cone is given by the following lemma.

Lemma 4.1 (Solid Angle of a Circular Cone [24]): The solid angle of a circular cone of half angle $\theta$ in $\mathbb{R}^N$ is
given by

\[ \Omega_N(\theta) = \frac{2 \pi^{\frac{N-1}{2}}}{\Gamma\left( \frac{N-1}{2} \right)} \int_0^{\theta} (\sin \phi)^{N-2}\, d\phi. \]

In particular, the solid angle of $\mathbb{R}^N$ is given by

\[ \Omega_N(\pi) = \frac{2 \pi^{\frac{N}{2}}}{\Gamma\left( \frac{N}{2} \right)}. \]

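Theorem 4.1 (The 1959 Sphere-Packing (SP59) Bound [24]): Assume that the transmission of an arbitrary block
code of length $N$ and rate $R$ (in units of nats per channel use per dimension) takes place over an AWGN channel
with noise spectral density $\frac{N_0}{2}$. Then, under ML decoding, the block error probability is lower bounded by

\[ P_e(\mathrm{ML}) > P_{\mathrm{SPB}}(N, \theta, A), \quad A \triangleq \sqrt{\frac{2 E_s}{N_0}} \tag{65} \]

where $E_s$ is the average energy per symbol, $\theta \in [0, \pi]$ satisfies the inequality $\exp(-NR) \le \frac{\Omega_N(\theta)}{\Omega_N(\pi)}$,

\[ P_{\mathrm{SPB}}(N, \theta, A) \triangleq \frac{(N - 1)\, e^{-\frac{N A^2}{2}}}{\sqrt{2\pi}} \int_{\theta}^{\frac{\pi}{2}} (\sin \phi)^{N-2}\, f_N\!\left( \sqrt{N} A \cos \phi \right) d\phi + Q\!\left( \sqrt{N} A \right) \tag{66} \]

and

\[ f_N(x) \triangleq \frac{1}{2^{\frac{N-1}{2}}\, \Gamma\left( \frac{N+1}{2} \right)} \int_0^{\infty} z^{N-1} \exp\left( -\frac{z^2}{2} + z x \right) dz, \quad \forall\, x \in \mathbb{R},\ N \in \mathbb{N}. \tag{67} \]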

By assumption, the transmitted signal is represented by a point which lies on the $N$-dimensional sphere of radius
$\sqrt{N E_s}$ and which is centered at the origin, and the Gaussian noise is additive. The value $P_{\mathrm{SPB}}(N, \theta, A)$ in the RHS
of (65) designates the probability that the received vector falls outside the $N$-dimensional circular cone of half angle
$\theta$ whose main axis passes through the origin and the signal point which is represented by the transmitted signal.
Hence, this function is monotonically decreasing in $\theta$. The tightest lower bound on the decoding error probability
is therefore achieved for $\theta_1(N, R)$ which satisfies

\[ \frac{\Omega_N\left( \theta_1(N, R) \right)}{\Omega_N(\pi)} = \exp(-NR). \tag{68} \]
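Equation (68) can be solved numerically by combining Lemma 4.1 with a bisection search, since the ratio $\Omega_N(\theta) / \Omega_N(\pi)$ is increasing in $\theta$. The sketch below works with the logarithm of the ratio and uses composite Simpson integration on a uniform grid; the function names and the number of integration steps are illustrative assumptions, and a uniform grid becomes inaccurate when $N$ is very large because the integrand is then sharply peaked.

```python
import math

def log_solid_angle_ratio(N, theta, steps=2000):
    """ln(Omega_N(theta) / Omega_N(pi)) for N >= 2 and 0 < theta <= pi.

    The integrand (sin phi)^(N-2) is rescaled by its maximum on [0, theta]
    so that the values being summed stay in [0, 1].
    """
    log_peak = (N - 2) * math.log(math.sin(min(theta, math.pi / 2)))

    def g(phi):
        if N == 2:
            return 1.0
        sn = math.sin(phi)
        return math.exp((N - 2) * math.log(sn) - log_peak) if sn > 0.0 else 0.0

    h = theta / steps
    total = g(0.0) + g(theta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    log_integral = log_peak + math.log(total * h / 3.0)
    # Omega_N(theta)/Omega_N(pi) = Gamma(N/2) / (sqrt(pi) Gamma((N-1)/2)) * integral
    return (math.lgamma(N / 2) - 0.5 * math.log(math.pi)
            - math.lgamma((N - 1) / 2) + log_integral)

def theta_1(N, R):
    """Half angle solving (68) by bisection (R in nats per dimension)."""
    lo, hi = 1e-12, math.pi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_solid_angle_ratio(N, mid) < -N * R:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $N = 3$ the ratio has the closed form $(1 - \cos\theta)/2$, which gives a convenient sanity check of the routine.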

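The calculation of $\theta_1(N, R)$ can become quite tedious. In order to simplify the calculation of the SP59 bound,
Shannon provided in [24] asymptotically tight upper and lower bounds on the ratio $\frac{\Omega_N(\theta)}{\Omega_N(\pi)}$.

Lemma 4.2 (Bounds on the Solid Angle [24]): The solid angle of a circular cone of half angle $\theta$ in the Euclidean
space $\mathbb{R}^N$ satisfies the inequality

\[ \frac{\Gamma\left( \frac{N}{2} \right) (\sin \theta)^{N-1}}{2\, \Gamma\left( \frac{N+1}{2} \right) \sqrt{\pi} \cos \theta} \left( 1 - \frac{\tan^2 \theta}{N} \right) \le \frac{\Omega_N(\theta)}{\Omega_N(\pi)} \le \frac{\Gamma\left( \frac{N}{2} \right) (\sin \theta)^{N-1}}{2\, \Gamma\left( \frac{N+1}{2} \right) \sqrt{\pi} \cos \theta}. \]

Corollary 4.1 (SP59 Bound (Cont.)): If $\theta^*$ satisfies the equation

\[ \frac{\Gamma\left( \frac{N}{2} \right) (\sin \theta^*)^{N-1}}{2\, \Gamma\left( \frac{N+1}{2} \right) \sqrt{\pi} \cos \theta^*} \left( 1 - \frac{\tan^2 \theta^*}{N} \right) = \exp(-NR) \tag{69} \]

then $\frac{\Omega_N(\theta^*)}{\Omega_N(\pi)} \ge \exp(-NR)$, and therefore

\[ P_e(\mathrm{ML}) > P_{\mathrm{SPB}}(N, \theta^*, A). \tag{70} \]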
The use of $\theta^*$ instead of the optimal value $\theta_1(N, R)$ causes some loss in the tightness of the SP59 bound. However,
due to the asymptotic tightness of the bounds on $\frac{\Omega_N(\theta)}{\Omega_N(\pi)}$, this loss vanishes as $N \to \infty$. In [35], it was numerically
observed that this loss is marginal even for relatively small values of $NR$; it was observed that this loss is smaller
than 0.01 dB whenever the dimension of the code in bits is greater than 20, and it becomes smaller than 0.001 dB
when the dimension exceeds 60 bits.
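Equation (69) can be solved in the log domain without any numerical integration. The sketch below is one possible approach, not the procedure of [24] or [35]: since the LHS of (69) first increases in $\theta$ and then falls back to zero at $\tan^2\theta = N$, the bisection is restricted to the increasing part, whose endpoint follows from setting the derivative of the logarithm of the LHS to zero. The function names are hypothetical.

```python
import math

def log_lower(N, theta):
    """ln of the left-hand side of (69); requires tan(theta)^2 < N."""
    t2 = math.tan(theta) ** 2
    return (math.lgamma(N / 2) - math.lgamma((N + 1) / 2)
            - math.log(2.0) - 0.5 * math.log(math.pi)
            + (N - 1) * math.log(math.sin(theta)) - math.log(math.cos(theta))
            + math.log1p(-t2 / N))

def theta_star(N, R):
    """Smallest root of (69), or None if exp(-N R) exceeds the peak of the LHS."""
    # With t = tan(theta)^2, the LHS peaks where 3 t^2 + t = N^2 - N.
    t_peak = (-1.0 + math.sqrt(1.0 + 12.0 * (N * N - N))) / 6.0
    lo, hi = 1e-12, math.atan(math.sqrt(t_peak))
    target = -N * R
    if log_lower(N, hi) < target:
        return None
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_lower(N, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $N = 3$, where $\Omega_3(\theta)/\Omega_3(\pi) = (1 - \cos\theta)/2$, the returned angle can be checked directly against the claim of Corollary 4.1.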

For large block lengths, the calculation of the SP59 bound becomes practically difficult due to overflows and underflows in
the floating-point operations. However, [24] presents some asymptotic formulas which give a good estimation of
the bound for large enough block lengths. These approximations allow the calculation to be made in the logarithmic
domain, which avoids these overflows and underflows.

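Theorem 4.2: [24]. Defining

\[ G(\theta) \triangleq \frac{A \cos \theta + \sqrt{A^2 \cos^2 \theta + 4}}{2} \]

\[ E_L(\theta) \triangleq \frac{A^2 - A\, G(\theta) \cos \theta - 2 \ln\left( G(\theta) \sin \theta \right)}{2} \]

then

\[ P_{\mathrm{SPB}}(N, \theta, A) > \frac{\sqrt{N - 1}}{6 N (A + 1)}\; e^{\frac{-(A+1)^2 + 3}{2}}\; e^{-N E_L(\theta)}. \tag{71} \]

This lower bound is valid for any block length $N$. However, the ratio of the left and right terms in (71) stays
bounded away from one for all $N$.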

A rather accurate approximation of $P_{\mathrm{SPB}}(N, \theta, A)$ was provided by Shannon in [24], but without a determined
inequality. As a consequence, the following approximation is not a proven theoretical lower bound on the block
error probability. For $N > 1000$, however, its numerical values become almost identical to those of the exact bound,
thus giving a useful estimation for the lower bound.

Proposition 4.1: [24]. Using the notation of Theorem 4.2, if $\theta > \cot^{-1}(A)$, then

\[ P_{\mathrm{SPB}}(N, \theta, A) \approx \frac{\alpha(\theta)\, e^{-N E_L(\theta)}}{\sqrt{N}} \]

where

\[ \alpha(\theta) \triangleq \left( \sqrt{\pi \left( 1 + G(\theta)^2 \right)}\, \sin \theta \left( A\, G(\theta) \sin^2 \theta - \cos \theta \right) \right)^{-1}. \]
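Both the lower bound (71) and the approximation of Proposition 4.1 are products of a pre-exponent and $e^{-N E_L(\theta)}$, so they are conveniently evaluated as logarithms. The following is a minimal sketch with hypothetical function names; it returns natural logarithms, which stay finite for block lengths where the probabilities themselves would underflow.

```python
import math

def G(theta, A):
    c = A * math.cos(theta)
    return 0.5 * (c + math.sqrt(c * c + 4.0))

def E_L(theta, A):
    g = G(theta, A)
    return 0.5 * (A * A - A * g * math.cos(theta)) - math.log(g * math.sin(theta))

def log_sp59_lower(N, theta, A):
    """Natural log of the right-hand side of (71)."""
    return (0.5 * math.log(N - 1) - math.log(6.0 * N * (A + 1.0))
            + 0.5 * (3.0 - (A + 1.0) ** 2) - N * E_L(theta, A))

def log_sp59_approx(N, theta, A):
    """Natural log of the approximation in Proposition 4.1.

    The pre-exponent is positive exactly when theta > arccot(A); a
    ValueError is raised otherwise.
    """
    g = G(theta, A)
    denom = (math.sqrt(math.pi * (1.0 + g * g)) * math.sin(theta)
             * (A * g * math.sin(theta) ** 2 - math.cos(theta)))
    if denom <= 0.0:
        raise ValueError("theta must exceed arccot(A)")
    return -math.log(denom) - 0.5 * math.log(N) - N * E_L(theta, A)
```

At $\theta = \pi/2$ the definitions reduce to $G = 1$ and $E_L = A^2/2$, which gives a simple check of the implementation.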



B. A Recent Algorithm for Calculating the 1959 Sphere-Packing Bound 

In [35, Section 2], Valembois and Fossorier review the SP59 bound and suggest a recursive algorithm to simplify
its calculation. This algorithm is presented in the following theorem:

Theorem 4.3 (Recursive Equations for Simplifying the Calculation of the SP59 Bound): [35, Theorem 3]. The
set of functions $\{f_N\}$ introduced in (67) can be expressed in the alternative form

\[ f_N(x) = P_N(x) + Q_N(x) \exp\left( \frac{x^2}{2} \right) \int_{-\infty}^{x} \exp\left( -\frac{t^2}{2} \right) dt, \quad x \in \mathbb{R},\ N \in \mathbb{N} \tag{72} \]

where $P_N$ and $Q_N$ are two polynomials, determined by the same recursive equation for all $N \ge 5$

\[ P_N(x) = \frac{2N - 5 + x^2}{N - 1}\, P_{N-2}(x) - \frac{N - 4}{N - 1}\, P_{N-4}(x), \]

\[ Q_N(x) = \frac{2N - 5 + x^2}{N - 1}\, Q_{N-2}(x) - \frac{N - 4}{N - 1}\, Q_{N-4}(x) \tag{73} \]

with the initial conditions

\[ P_1(x) = 0, \quad P_2(x) = \sqrt{\frac{2}{\pi}}, \quad P_3(x) = \frac{x}{2}, \quad P_4(x) = \sqrt{\frac{2}{\pi}}\, \frac{2 + x^2}{3}, \]

\[ Q_1(x) = 1, \quad Q_2(x) = \sqrt{\frac{2}{\pi}}\, x, \quad Q_3(x) = \frac{1 + x^2}{2}, \quad Q_4(x) = \sqrt{\frac{2}{\pi}}\, \frac{3x + x^3}{3}. \]
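The recursion (73) is straightforward to implement with coefficient lists. The sketch below is an illustration for small $N$ only (it works in the probability domain, so it inherits the underflow problems of this approach for large $N$); the helper names are hypothetical, and `f_direct` evaluates the integral in (67) by composite Simpson integration so that the two expressions for $f_N$ can be compared.

```python
import math

SQ = math.sqrt(2.0 / math.pi)
# Coefficient lists (index = power of x) for the initial conditions.
P_INIT = {1: [0.0], 2: [SQ], 3: [0.0, 0.5], 4: [2 * SQ / 3, 0.0, SQ / 3]}
Q_INIT = {1: [1.0], 2: [0.0, SQ], 3: [0.5, 0.0, 0.5], 4: [0.0, SQ, 0.0, SQ / 3]}

def step(N, prev2, prev4):
    """One application of (73): coefficients of P_N (or Q_N)."""
    out = [0.0] * (len(prev2) + 2)
    for i, c in enumerate(prev2):
        out[i] += (2 * N - 5) * c / (N - 1)   # (2N - 5) / (N - 1) term
        out[i + 2] += c / (N - 1)             # x^2 / (N - 1) term
    for i, c in enumerate(prev4):
        out[i] -= (N - 4) * c / (N - 1)
    return out

def polys(N):
    P, Q = dict(P_INIT), dict(Q_INIT)
    for n in range(5, N + 1):
        P[n] = step(n, P[n - 2], P[n - 4])
        Q[n] = step(n, Q[n - 2], Q[n - 4])
    return P[N], Q[N]

def horner(coeffs, x):
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def f_recursive(N, x):
    """f_N(x) via (72); the Gaussian integral is expressed through erf."""
    P, Q = polys(N)
    gauss = (math.exp(0.5 * x * x) * math.sqrt(math.pi / 2)
             * (1 + math.erf(x / math.sqrt(2))))
    return horner(P, x) + horner(Q, x) * gauss

def f_direct(N, x, upper=40.0, steps=40000):
    """f_N(x) via Simpson integration of (67), truncated at z = upper."""
    h = upper / steps
    g = lambda z: z ** (N - 1) * math.exp(-0.5 * z * z + z * x)
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3 * 2 ** (-(N - 1) / 2) / math.gamma((N + 1) / 2)
```

For instance, `polys(5)` gives the coefficients of $P_5(x) = (5x + x^3)/8$ and $Q_5(x) = (3 + 6x^2 + x^4)/8$.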

By examining the recursive equations for $P_N$ and $Q_N$ in (73), it is noticed that the coefficients of the higher powers
of $x$ vanish exponentially as $N$ increases. When performing the calculation using double-precision floating-point
numbers, these coefficients cause underflows when $N$ is larger than several hundreds, and are replaced by zeros.
Examining the expression for $P_{\mathrm{SPB}}(N, \theta, A)$ in (66), we observe that $f_N(x)$ (and therefore the polynomials $P_N(x)$
and $Q_N(x)$) is evaluated at $x \sim O(\sqrt{N})$. Hence, the replacement of the coefficients of the high powers of $x$ by zeros
causes a considerable inaccuracy in the calculation of $P_{\mathrm{SPB}}$ in (66). To exemplify the effect of these underflows,
we study the coefficients of $P_{750}(x)$ as calculated using double-precision floating-point numbers. In this case, the
coefficients of all the powers higher than 400 have caused underflows and have been replaced by zeros. The left
plot of Figure 1 shows the coefficients of $P_{750}(x)$. Since $f_N(x)$ is evaluated at $x \sim O(\sqrt{N})$, one should examine
the coefficients of $\tilde{P}_{750}(x) \triangleq P_{750}(\sqrt{750}\, x)$, which are plotted in the right plot of Figure 1. It can be seen that the
dominant coefficients are those multiplying the powers of $x$ between 400 and 520 which, as mentioned above, have
been replaced by zeros due to underflows. This demonstrates the inaccuracy due to underflows in the coefficients
of the high powers. To avoid this loss of dominant coefficients, it is possible to modify the recursive equations (73)
in order to calculate the polynomials $\tilde{P}_N(x) \triangleq P_N(\sqrt{N}\, x)$ and $\tilde{Q}_N(x) \triangleq Q_N(\sqrt{N}\, x)$. However, as observed in the
right plot of Figure 1, these coefficients become extremely large and cause overflows when $N$ approaches 1000.




[Figure 1: two plots (left and right) of polynomial coefficients versus the power of x, with the horizontal axis running from 100 to 800.]

Fig. 1. Coefficients of the polynomials $P_{750}(x)$ (left plot) and $\tilde{P}_{750}(x) = P_{750}(\sqrt{750}\, x)$ (right plot). Since the polynomials are even, only
the coefficients multiplying the even powers of $x$ have been plotted. It can be observed that in the right plot, the coefficients of powers of
$x$ between 400 and 520 are dominant. These coefficients have caused underflows in the calculation of $P_{750}(x)$ in the left plot.



Considering the integrand in the RHS of (66) reveals another difficulty in calculating the SP59 bound for large
values of $N$. For these values, the term $f_N(\sqrt{N} A \cos \phi)$ becomes very large and causes overflows, while the
value of the term $(\sin \phi)^{N-2}$ becomes very small and causes underflows; this causes a "$0 \cdot \infty$" phenomenon when
evaluating the integrand at the RHS of (66).
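The "$0 \cdot \infty$" effect is easy to reproduce in double-precision arithmetic. In the short sketch below, the block length, the angle and the value `log_big` (standing in for a large $\ln f_N$) are hypothetical numbers chosen only to trigger the underflow and the overflow.

```python
import math

N, phi = 5000, 0.1
direct = math.sin(phi) ** (N - 2)               # underflows to 0.0
log_small = (N - 2) * math.log(math.sin(phi))   # finite in the log domain

log_big = 11000.0          # hypothetical value of ln f_N, too large for exp()
try:
    big = math.exp(log_big)
except OverflowError:
    big = math.inf

product_direct = direct * big        # 0.0 * inf, which evaluates to nan
log_product = log_small + log_big    # the same product, computed as a sum of logs
```

Working with `log_small` and `log_big` keeps the product representable, which is the motivation for the log-domain formulation of the integrand.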

C. A Log-Domain Approach for Computing the 1959 Sphere-Packing Bound 

In this section, we present a method which enables the entire calculation of the integrand in the RHS of (66) in the
log domain, thus circumventing the numerical overflows and underflows which become problematic in the calculation of
the SP59 bound for large block lengths. We begin our derivation by representing the set of functions $\{f_N\}$ defined
in (67) as sums of exponents.

Proposition 4.2: The set of functions $\{f_N\}$ in (67) can be expressed in the form

\[ f_N(x) = \sum_{j=0}^{N-1} \exp\left( d(N, j, x) \right) \]

where

\[ d(N, j, x) \triangleq \frac{x^2}{2} + \ln \Gamma\left( \frac{N}{2} \right) - \ln \Gamma\left( \frac{j}{2} + 1 \right) - \ln \Gamma(N - j) + (N - 1 - j) \ln\left( \sqrt{2}\, x \right) - \frac{\ln 2}{2} + \ln\left[ 1 + (-1)^j\, \tilde{\gamma}\left( \frac{x^2}{2}, \frac{j + 1}{2} \right) \right], \]
\[ N \in \mathbb{N}, \quad x \in \mathbb{R}, \quad j = 0, 1, \ldots, N - 1 \tag{74} \]

and

\[ \Gamma(a) \triangleq \int_0^{\infty} t^{a-1} e^{-t}\, dt, \quad \mathrm{Re}(a) > 0 \tag{75} \]

\[ \tilde{\gamma}(x, a) \triangleq \frac{1}{\Gamma(a)} \int_0^{x} t^{a-1} e^{-t}\, dt, \quad x \in \mathbb{R},\ \mathrm{Re}(a) > 0 \tag{76} \]

designate the complete and incomplete Gamma functions, respectively.

Proof: The proof is given in Appendix C. ■
Remark 4.1: It is noted that the exponents d(N, j, x) in (74) are readily calculated by using standard mathematical
functions. The function which calculates the natural logarithm of the Gamma function is implemented in the
MATLAB software by gammaln, and in the Mathematica software by LogGamma. The function $\tilde{\gamma}(x, a)$ is
implemented in MATLAB by gammainc(x, a) and in Mathematica by GammaRegularized[a, 0, x].
In order to perform the entire calculation of the function $f_N$ in the log domain, we employ the function

$$ \max{}^*(x_1, \ldots, x_m) \triangleq \ln\Bigl(\sum_{i=1}^{m} e^{x_i}\Bigr), \quad m \in \mathbb{N},\ x_1, \ldots, x_m \in \mathbb{R} \tag{77} $$



which is commonly used in the implementation of the log-domain BCJR algorithm. The function max* can be 
calculated in the log domain using the recursive equation 

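$$ \max{}^*(x_1, \ldots, x_{m+1}) = \max{}^*\bigl(\max{}^*(x_1, \ldots, x_m),\, x_{m+1}\bigr), \quad m \in \mathbb{N} \setminus \{1\},\ x_1, \ldots, x_{m+1} \in \mathbb{R} $$

with the initial condition

$$ \max{}^*(x_1, x_2) = \max(x_1, x_2) + \ln\bigl(1 + e^{-|x_1 - x_2|}\bigr). $$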
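
The recursion and its initial condition translate directly into a few lines of code. The sketch below is a minimal Python illustration; the names `max_star_pair` and `max_star` are chosen here for convenience and do not come from the paper.

```python
import math
from functools import reduce

def max_star_pair(x1, x2):
    """Initial condition: max*(x1, x2) = max(x1, x2) + ln(1 + exp(-|x1 - x2|))."""
    if x1 == x2:                      # also covers x1 = x2 = -inf
        return x1 + math.log(2.0)
    return max(x1, x2) + math.log1p(math.exp(-abs(x1 - x2)))

def max_star(values):
    """Fold the pairwise rule over the arguments, following the recursion."""
    return reduce(max_star_pair, values)
```

For example, `max_star([1000.0, 1000.0])` returns 1000 + ln 2 without ever forming $e^{1000}$, which would overflow in double precision.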

Combining Proposition 4.2 and the definition of the function max* in (77), we get a method of calculating the set
of functions $\{f_N\}$ in the log domain.

Corollary 4.2: The set of functions $\{f_N\}$ defined in (67) can be rewritten in the form

$$ f_N(x) = \exp\Bigl[\max{}^*\bigl(d(N,0,x),\, d(N,1,x),\, \ldots,\, d(N,N-1,x)\bigr)\Bigr] \tag{78} $$

where d(N, j, x) is introduced in (74).

By combining (66) and (78), one gets the following theorem which provides an efficient algorithm for the
calculation of the SP59 bound in the log domain.

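Theorem 4.4 (Log domain calculation of the SP59 bound): The term $P_{\mathrm{SPB}}(N,\theta,A)$ in the RHS of (70) can be
rewritten as

$$ P_{\mathrm{SPB}}(N,\theta,A) = \int_{\theta}^{\pi/2} \exp\Bigl[ -\frac{N A^2}{2} - \frac{1}{2}\ln(2\pi) + (N-2)\ln\sin\phi + \ln(N-1) + \max{}^*\bigl(d(N,0,\sqrt{N}A\cos\phi),\, \ldots,\, d(N,N-1,\sqrt{N}A\cos\phi)\bigr) \Bigr]\,d\phi + Q(\sqrt{N}A), \quad N \in \mathbb{N},\ \theta \in [0, \tfrac{\pi}{2}],\ A \in \mathbb{R}^{+} $$

where d(N, j, x) is defined in (74).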
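
The log-domain evaluation can be sketched in a few dozen lines of standard-library Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes that $f_N$ in (67) is Shannon's function $f_N(x) = \frac{1}{2^{(N-1)/2}\Gamma(\frac{N+1}{2})}\int_0^\infty z^{N-1} e^{-z^2/2+zx}\,dz$, it restricts attention to x > 0, it replaces a library call for the incomplete Gamma function by the elementary closed forms available when the second argument is an integer or a half-integer, and it uses a plain midpoint rule for the integral over $\phi$. All function names are hypothetical.

```python
import math

def max_star(values):
    """max*(x_1..x_m) = ln(sum_i exp(x_i)), evaluated stably."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_one_pm_gamma(j, y):
    """ln[1 + (-1)^j * P((j+1)/2, y)], P = regularized lower incomplete Gamma, y > 0."""
    if j % 2 == 1:
        # integer a: 1 - P(a, y) = exp(-y) * sum_{k<a} y^k / k!  (kept in the log domain)
        a = (j + 1) // 2
        return -y + max_star([k * math.log(y) - math.lgamma(k + 1) for k in range(a)])
    # a = m + 1/2: P(a, y) = erf(sqrt(y)) - exp(-y) * sum_{k<m} y^(k+1/2) / Gamma(k+3/2)
    m = j // 2
    tail = sum(math.exp(-y + (k + 0.5) * math.log(y) - math.lgamma(k + 1.5))
               for k in range(m))
    return math.log1p(max(0.0, math.erf(math.sqrt(y)) - tail))

def d(N, j, x):
    """Exponent d(N, j, x) of Proposition 4.2, for x > 0."""
    y = 0.5 * x * x
    return (y + math.lgamma(N / 2) - math.lgamma(j / 2 + 1) - math.lgamma(N - j)
            + (N - 1 - j) * math.log(math.sqrt(2.0) * x) - 0.5 * math.log(2.0)
            + log_one_pm_gamma(j, y))

def log_f(N, x):
    """ln f_N(x) via Corollary 4.2."""
    return max_star([d(N, j, x) for j in range(N)])

def p_spb(N, theta, A, steps=400):
    """Midpoint-rule evaluation of the expression in Theorem 4.4 (N >= 2)."""
    h = (math.pi / 2 - theta) / steps
    const = math.log(N - 1) - 0.5 * N * A * A - 0.5 * math.log(2.0 * math.pi)
    logs = []
    for i in range(steps):
        phi = theta + (i + 0.5) * h          # midpoints avoid sin(0) and cos(pi/2)
        x = math.sqrt(N) * A * math.cos(phi)
        logs.append(const + (N - 2) * math.log(math.sin(phi)) + log_f(N, x))
    q = 0.5 * math.erfc(math.sqrt(N) * A / math.sqrt(2.0))   # Q(sqrt(N) A)
    return math.exp(max_star(logs) + math.log(h)) + q
```

Under the assumptions above, $\theta = 0$ corresponds to a degenerate cone, so `p_spb(N, 0.0, A)` should return 1 up to the quadrature error; this gives a convenient sanity check. A call such as `p_spb(200, 0.4, 3.0, steps=100)` involves values of $x^2/2$ above 800, for which $e^{x^2/2}$ overflows in double precision, yet every intermediate quantity in the sketch stays finite.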

Using Theorem 4.4, it is easy to calculate the exact value of the SP59 lower bound for very large block lengths.

V. Numerical Results for Sphere-Packing Bounds

This section presents some numerical results which serve to demonstrate the improved tightness of the ISP
bound derived in Section III. We consider performance bounds for M-ary PSK block coded modulation with
coherent detection over an AWGN channel, and for the binary erasure channel (BEC), which is MBIOS. As noted
in Section III-A, these channels are symmetric and hence the ISP bound holds in these cases. For M-ary PSK
modulated signals transmitted over the AWGN channel, the ISP bound is also compared with the SP59 bound
revisited in Section IV, and with some upper bounds on the decoding error probability. We also compare these bounds
to some computer simulations of iteratively decoded codes, and examine the tightness of these bounds w.r.t. the
performance of modern error-correcting codes using practical decoding algorithms.

A. Performance Bounds for M-ary PSK Block Coded Modulation over the AWGN Channel 

The ISP bound in Section III is particularized here to M-ary PSK block coded modulation schemes whose
transmission takes place over an AWGN channel, and where the received signals are coherently detected. For
simplicity of notation, we treat the channel inputs and outputs as two-dimensional real vectors, and not as complex
numbers. Let $M = 2^p$ (where $p \in \mathbb{N}$) be the modulation parameter, and denote the input to the channel by $X = (x_1, x_2)$,
where the possible input values are given by

$$ X_k = (\cos\theta_k, \sin\theta_k), \quad \theta_k = \frac{(2k+1)\pi}{M}, \quad k = 0, 1, \ldots, M-1. \tag{79} $$

We denote the channel output by $Y = (y_1, y_2)$ where $Y = X + N$, and $N = (n_1, n_2)$ is a Gaussian random vector
with i.i.d. components each with zero mean and variance $\sigma^2$. The conditional pdf of the channel output, given the
transmitted symbol $X_k$, is given by

$$ p_{Y|X}(Y|X_k) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{\|Y - X_k\|^2}{2\sigma^2}}, \quad Y \in \mathbb{R}^2 \tag{80} $$

where $\|\cdot\|$ designates the $L_2$ norm. The closed form expressions for the function $\mu_0$ and its first two derivatives
w.r.t. s (while holding $f_s$ fixed) are derived in Appendix B.1 and are used for the calculation of both the VF and ISP
bounds. The SP59 bound [24] provides a lower bound on the decoding error probability for the considered case, 
since the modulated signals have equal energy and are transmitted over the AWGN channel. In the following, we 
exemplify the use of these lower bounds. They are also compared to the random-coding upper bound of Gallager 
[11], and the tangential-sphere upper bound (TSB) of Poltyrev [20] when applied to random block codes. This serves 
for the study of the tightness of the ISP bound, as compared to other upper and lower bounds. The numerical results 
shown in this section indicate that the recent variants of the SP67 bound provide an interesting alternative to the 
SP59 bound which is commonly used in the literature as a measure for the sub-optimality of codes transmitted 
over the AWGN channel (see, e.g., [9], [14], [17], [23], [29], [35], [37]). Moreover, the advantage of the ISP bound 
over the VF bound in [35] is exemplified in this section. 
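
The channel model of this subsection can be written down in a few lines. The following Python sketch is illustrative only. It assumes the phase convention $\theta_k = (2k+1)\pi/M$ for the constellation in (79), unit-energy symbols, and the common normalization $\sigma^2 = N_0/2$ per real dimension, so that $\sigma^2 = 1/(2R\,E_b/N_0)$ for a rate of R information bits per channel use; the function names and this normalization are assumptions made here, not statements from the paper.

```python
import math

def psk_points(M):
    """Unit-energy M-ary PSK points (cos th_k, sin th_k), th_k = (2k+1)*pi/M (assumed convention)."""
    return [(math.cos((2 * k + 1) * math.pi / M), math.sin((2 * k + 1) * math.pi / M))
            for k in range(M)]

def cond_pdf(y, x_k, sigma):
    """Conditional pdf of the 2-D output y given the symbol x_k, i.i.d. N(0, sigma^2) noise."""
    dist2 = (y[0] - x_k[0]) ** 2 + (y[1] - x_k[1]) ** 2
    return math.exp(-dist2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

def ml_detect(y, points):
    """Symbol-wise ML detection: index of the nearest constellation point."""
    return min(range(len(points)),
               key=lambda k: (y[0] - points[k][0]) ** 2 + (y[1] - points[k][1]) ** 2)

def noise_std(ebno_db, rate):
    """Noise standard deviation per dimension for unit-energy symbols (assumed normalization)."""
    ebno = 10.0 ** (ebno_db / 10.0)
    return math.sqrt(1.0 / (2.0 * rate * ebno))
```

Because the pdf depends on the output only through its Euclidean distance to the transmitted point, and the points are equally spaced on the unit circle, every input sees the same noise statistics up to a rotation, which is the symmetry that the ISP bound relies on.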

Figure 2 compares the SP59 bound [24], the VF bound [35], and the ISP bound derived in Section III. The
comparison refers to block codes of length 500 bits and rate 0.8 bits per channel use which are BPSK modulated and transmitted
over an AWGN channel. The plot also depicts the random-coding bound of Gallager [11], the TSB ([13], [20]),
and the capacity limit bound (CLB).¹ It is observed from this figure that even for relatively short block lengths, the
ISP bound outperforms the SP59 bound for block error probabilities below $10^{-1}$ (this issue will be discussed later
in this section). For a block error probability of $10^{-5}$, the ISP bound provides gains of about 0.26 and 0.33 dB
over the SP59 and VF bounds, respectively. For these code parameters, the TSB provides a tighter upper bound
on the block error probability of random codes than the random-coding bound; e.g., the gain of the TSB over the
Gallager bound is about 0.2 dB for a block error probability of $10^{-5}$. Note that the Gallager bound is tighter than
the TSB for fully random block codes of large enough block lengths, as the latter bound does not reproduce the
random-coding error exponent for the AWGN channel [20]. However, Figure 2 exemplifies the advantage of the
TSB over the Gallager bound, when applied to random block codes of relatively short block lengths; this advantage
is especially pronounced for low code rates where the gap between the error exponents of these two bounds is

¹Although the CLB refers to the asymptotic case where the block length tends to infinity, it is plotted in [35] and here as a reference, in
order to examine whether the improvement in the tightness of the ISP bound is for rates above or below capacity.

Fig. 2. A comparison between upper and lower bounds on the ML decoding error probability for block codes of length N = 500 bits
and code rate of 0.8 bits per channel use. This figure refers to BPSK modulated signals whose transmission takes place over an AWGN channel. The
compared bounds are the 1959 sphere-packing (SP59) bound of Shannon [24], the Valembois-Fossorier (VF) bound [35], the improved
sphere-packing (ISP) bound derived in Section III, the random-coding upper bound of Gallager [11], and the TSB [13], [20] when applied
to fully random block codes with the above block length and rate.



marginal (see [23, p. 67] and [32]), but it is also reflected from Figure 2 for BPSK modulation with a code rate
of 0.8 bits per channel use. The gap between the TSB and the ISP bound, as upper and lower bounds respectively, is less than
1.2 dB for all block error probabilities lower than $10^{-1}$. Also, the ISP bound is more informative than the CLB
for block error probabilities below $8 \cdot 10^{-3}$, while the SP59 and VF bounds require block error probabilities below
$1.5 \cdot 10^{-3}$ and $5 \cdot 10^{-4}$, respectively, to outperform the capacity limit.

Figure 3 presents a comparison of the SP59, VF and ISP bounds referring to short block codes which are QPSK
modulated and transmitted over the AWGN channel. The plots also depict the random-coding upper bound, the
TSB and CLB; in these plots, the ISP bound outperforms the SP59 bound for all block error probabilities below
$4 \cdot 10^{-1}$ (this result is consistent with the left plot of Figure 7). In the upper plot of Figure 3, which corresponds
to a block length of 1024 bits (i.e., 512 QPSK symbols) and a rate of 1.5 bits per channel use, it is shown that the ISP bound
provides gains of about 0.25 and 0.37 dB over the SP59 and VF bounds, respectively, for a block error probability
of $10^{-5}$. The gap between the ISP lower bound and the random-coding upper bound is 0.78 dB for all block error
probabilities lower than $10^{-1}$. In the lower plot of Figure 3, which corresponds to a block length of 300 bits and a
rate of 1.8 bits per channel use, the ISP bound significantly improves on the SP59 and VF bounds; for a block error probability of
$10^{-5}$, the improvement in the tightness of the ISP bound over the SP59 and VF bounds is 0.8 and 1.13 dB, respectively.
Additionally, the ISP bound is more informative than the CLB for block error probabilities below $3 \cdot 10^{-3}$, whereas the
SP59 and VF bounds outperform the CLB only for block error probabilities below $3 \cdot 10^{-6}$ and $5 \cdot 10^{-8}$, respectively.
For fully random block codes of length N = 300 and rate 1.8 bits per channel use which are QPSK modulated with Gray's
mapping and transmitted over the AWGN channel, the TSB is tighter than the random-coding bound (see the lower
plot in Figure 3 and the explanation referring to Figure 2). The gap between the ISP bound and the TSB in this
plot is about 1.5 dB for a block error probability of $10^{-5}$ (as compared to gaps of 2.3 dB (2.63 dB) between the
TSB and the SP59 (VF) bound).

Figure 4 presents a comparison of the bounds for codes of block length 5580 bits and 4092 information bits, where
both QPSK (upper plot) and 8-PSK (lower plot) constellations are considered. The modulated signals correspond to
2790 and 1860 symbols, respectively, so the code rates for these constellations are 1.467 and 2.2 bits per channel
use, respectively. For both constellations, the two considered SP67-based bounds (i.e., the VF and ISP bounds)
outperform the SP59 bound for all block error probabilities below $2 \cdot 10^{-1}$; the ISP bound provides gains of 0.1 and
0.22 dB over the VF bound for the QPSK and 8-PSK constellations, respectively. For both modulations, the gap

Fig. 3. A comparison between upper and lower bounds on the ML decoding error probability, referring to short block codes which are
QPSK modulated and transmitted over the AWGN channel. The compared lower bounds are the 1959 sphere-packing (SP59) bound of
Shannon [24], the Valembois-Fossorier (VF) bound [35], and the improved sphere-packing (ISP) bound; the compared upper bounds are the
random-coding upper bound of Gallager [11] and the tangential-sphere bound (TSB) of Poltyrev [20]. The upper plot refers to block codes
of length N = 1024 which are encoded by 768 information bits (so the rate is 1.5 bits per channel use), and the lower plot refers to block codes of
length N = 300 which are encoded by 270 bits, whose rate is therefore 1.8 bits per channel use.



between the ISP lower bound and the random-coding upper bound of Gallager does not exceed 0.4 dB. In [6],
Divsalar and Dolinar design codes with the considered parameters by using concatenated Hamming and accumulate
codes. They also present computer simulations of the performance of these codes under iterative decoding, when
the transmission takes place over the AWGN channel and several common modulation schemes are applied. For a
block error probability of $10^{-4}$, the gap between the simulated performance of these codes under iterative decoding,
and the ISP lower bound, which gives an ultimate lower bound on the block error probability of optimally designed
codes under ML decoding, is approximately 1.4 dB for QPSK and 1.6 dB for 8-PSK signaling. This provides an
indication of the performance of codes defined on graphs and their iterative decoding algorithms, especially in
light of the feasible complexity of the decoding algorithm which is linear in the block length. To conclude, it is

[Figure 4: horizontal axis: $E_b/N_0$ [dB].]

Fig. 4. A comparison of upper and lower bounds on the ML decoding error probability for block codes of length N = 5580 bits and
information block length of 4092 bits. This figure refers to QPSK (upper plot) and 8-PSK (lower plot) modulated signals whose transmission
takes place over an AWGN channel; the rates in this case are 1.467 and 2.200 bits per channel use, respectively. The compared bounds are the 1959
sphere-packing (SP59) bound of Shannon [24], the Valembois-Fossorier (VF) bound [35], the improved sphere-packing (ISP) bound, and
the random-coding upper bound of Gallager [11].



reflected from the results plotted in Figure 4 that a gap of about 1.5 dB between the ISP lower bound and the
performance of the iteratively decoded codes in [6] is mainly due to the imperfectness of these codes and their
sub-optimal iterative decoding algorithm; this conclusion follows in light of the fact that for random codes of the
same block length and rate, the gap between the ISP bound and the random-coding bound is reduced to less than
0.4 dB.

While it was shown in Section III that the ISP bound is uniformly tighter than the VF bound (which in turn is
uniformly tighter than the SP67 bound [25]), no such relations are shown between the SP59 bound and the recent
improvements on the SP67 bound (i.e., the VF and ISP bounds). Figure 5 presents regions of code rates and block
lengths for which the ISP bound outperforms the SP59 bound and the CLB; it refers to BPSK modulated signals

Fig. 5. Regions in the two-dimensional space of code rate and block length, where a bound is better than the two others for three different
targets of block error probability ($P_e$). The figure compares the tightness of the 1959 sphere-packing (SP59) bound of Shannon [24], the
improved sphere-packing (ISP) bound, and the capacity-limit bound (CLB). The plot refers to BPSK modulated signals whose transmission
takes place over the AWGN channel, and the considered code rates lie in the range between 0.1 and 1 bit per channel use.




[Figure 6: two panels; horizontal axis of each panel: code rate [bit/channel use].]

Fig. 6. Regions in the two-dimensional space of code rate and block length, where a bound is better than the two others for three
different targets of block error probability ($P_e$). The figure compares the tightness of the 1959 sphere-packing (SP59) bound of Shannon
[24], the capacity-limit bound (CLB), and the Valembois-Fossorier (VF) bound [35] (left plot) or the improved sphere-packing (ISP) bound
in Section III (right plot). The plots refer to BPSK modulated signals whose transmission takes place over the AWGN channel, and the
considered code rates lie in the range between 0.70 and 1 bit per channel use.



transmitted over the AWGN channel and considers block error probabilities of $10^{-4}$, $10^{-5}$ and $10^{-6}$. It is reflected
from this figure that for any rate $0 < R < 1$, there exists a block length N = N(R) such that the ISP bound
outperforms the SP59 bound for block lengths larger than N(R); the same property also holds for the VF bound,
but the value of N(R) depends on the considered SP67-based bound, and it becomes significantly larger in the
comparison of the VF and SP59 bounds. It is also observed that the value N(R) is monotonically decreasing with
R, and it approaches infinity as we let R tend to zero. An intuitive explanation for this behavior can be given
by considering the capacity limits of the binary-input and the energy-constrained AWGN channels. For any value
$0 < C < 1$, denote by $\frac{E_{b,1}(C)}{N_0}$ and $\frac{E_{b,2}(C)}{N_0}$ the values of $\frac{E_b}{N_0}$ required to achieve a channel capacity of C bits per
channel use for the binary-input and the energy-constrained AWGN channels, respectively (note that in the latter

Fig. 7. Regions in the two-dimensional space of code rate and block length, where a bound is better than the two others for different targets
of block error probability ($P_e$). The figure compares the tightness of the 1959 sphere-packing (SP59) bound of Shannon [24], the improved
sphere-packing (ISP) bound, and the capacity-limit bound (CLB). The plots refer to QPSK (left plot) and 8-PSK (right plot) modulated
signals whose transmission takes place over the AWGN channel; the considered code rates lie in the range between 1.4 and 2 bits per channel use for
the QPSK modulated signals and between 2.1 and 3 bits per channel use for the 8-PSK modulated signals.



case, the input distribution which achieves capacity is also Gaussian). For any $0 < C < 1$, clearly $\frac{E_{b,1}(C)}{N_0} > \frac{E_{b,2}(C)}{N_0}$;
however, the difference between these values is monotonically increasing with the capacity C, and, on the other
hand, this difference approaches zero as we let C tend to zero. Since the SP59 bound only constrains the signals to
be of equal energy, it gives a measure of performance for the energy-constrained AWGN channel, whereas the SP67-based
bounds consider the actual modulation and therefore refer to the binary-input AWGN channel. As the code
rates become higher, the difference in the ultimate performance between the two channels is larger, and therefore
the SP67-based bounding techniques outperform the SP59 bound for smaller block lengths. For low code rates,
the difference between the channels is reduced, and the SP59 bound outperforms the SP67-based bounding techniques
even for larger block lengths due to the superior bounding technique which is specifically tailored for the AWGN
channel.

Figure 6 presents the regions of code rates and block lengths for which the VF bound (left plot) and
the ISP bound (right plot) outperform the CLB and the SP59 bound when the signals are BPSK modulated and
transmitted over the AWGN channel; block error probabilities of $10^{-4}$, $10^{-5}$ and $10^{-6}$ are examined. This figure
is focused on high code rates, where the performance of the SP67-based bounds and their advantage over the SP59
bound is most appealing. From Figure 6, we have that for a code rate of 0.75 bits per channel use and a block
error probability of $10^{-6}$, the VF bound becomes tighter than the SP59 bound for block lengths exceeding 850 bits, while
the ISP bound reduces this value to 450 bits; moreover, when increasing the rate to 0.8 bits per channel use, the
minimal block lengths reduce to 550 and 280 bits for the VF and ISP bounds, respectively.

Figure 7 shows
the regions of code rates and block lengths where the ISP bound outperforms the CLB and SP59 bounds for QPSK (left
plot) and 8-PSK (right plot) modulations. Comparing the right plot of Figure 6, which refers to BPSK modulation,
with the left plot of Figure 7, which refers to QPSK modulation, one can see that the two graphs are identical
(when accounting for the doubling of the rate which is due to the use of both real and imaginary dimensions in the
QPSK modulation). This is due to the fact that QPSK modulation poses no additional constraints on the channel
and in fact, the real and imaginary planes can be serialized and decoded as in BPSK modulation. However, this
property does not hold when replacing the ISP bound by the VF bound; this is due to the fact that the VF bound
considers a fixed composition subcode of the original code and the increased size of the alphabet causes a greater
loss in the rate for QPSK modulation. When comparing the two plots of Figure 7, it is evident that the minimal
value of the block length for which the ISP bound becomes better than the SP59 bound decreases as the size of
the input alphabet is increased (when the rate is measured in units of information bits per code bit). An intuitive
justification for this phenomenon is attributed to the fact that, referring to the constellation points of the M-ary PSK
modulation, the mutual information between the code symbols in each dimension of the QPSK modulation is zero,
while as the spectral efficiency of the PSK modulation is increased, the mutual information between the real and
imaginary parts of each signal point is increased; thus, as the spectral efficiency is increased, this poses a stronger
constraint on the possible positioning of the equal-energy signal points on the N-dimensional sphere. This intuition
suggests an explanation for the reason why, as the spectral efficiency is increased, the advantage of the ISP bound
over the SP59 bound (where the latter does not take into account the modulation scheme) holds even for smaller
block lengths. This effect is expected to be more subtle for the VF bound since a larger size of the input alphabet
decreases the rate for which the error exponent is evaluated (see (22)).
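
The capacity argument used earlier in this subsection to explain the behavior of N(R) can be checked numerically. The Python sketch below computes, for a given capacity C in bits per channel use, the value of $E_b/N_0$ required on the binary-input AWGN channel and on the energy-constrained (Gaussian-input) real AWGN channel. It is a sketch under stated assumptions: antipodal inputs ±1, noise variance $\sigma^2 = N_0/2$, a trapezoidal rule for the capacity integral, and bisection over $\sigma$. The function names and numerical settings are choices made here and are not part of the paper.

```python
import math

def biawgn_capacity(sigma, steps=2000):
    """Capacity (bits per channel use) of the AWGN channel with inputs +/-1, noise std sigma."""
    lo, hi = 1.0 - 12.0 * sigma, 1.0 + 12.0 * sigma
    h = (hi - lo) / steps
    acc = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        pdf = math.exp(-(y - 1.0) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        t = -2.0 * y / sigma ** 2
        # log2(1 + e^t), evaluated without overflow
        soft = (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / math.log(2.0)
        weight = 0.5 if i in (0, steps) else 1.0
        acc += weight * pdf * soft * h
    return 1.0 - acc

def ebno_db_binary_input(C):
    """Eb/N0 in dB at which the binary-input AWGN channel has capacity C (0 < C < 1)."""
    lo, hi = 1e-2, 1e2                    # capacity decreases as sigma grows
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if biawgn_capacity(mid) > C:
            lo = mid
        else:
            hi = mid
    sigma = math.sqrt(lo * hi)
    return 10.0 * math.log10(1.0 / (2.0 * C * sigma ** 2))

def ebno_db_energy_constrained(C):
    """Eb/N0 in dB for the real AWGN channel with Gaussian input: C = 0.5*log2(1 + 2*C*Eb/N0)."""
    return 10.0 * math.log10((2.0 ** (2.0 * C) - 1.0) / (2.0 * C))
```

Evaluating the difference `ebno_db_binary_input(C) - ebno_db_energy_constrained(C)` at C = 0.25, 0.5 and 0.75 shows a gap that is positive, grows with C, and is already very small at C = 0.25, in line with the intuitive explanation given in the text.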

B. Performance Bounds for the Binary Erasure Channel 




[Figure 8: horizontal axis: channel erasure probability.]

Fig. 8. A comparison of the improved sphere-packing (ISP) lower bound from Section III and the exact decoding error probability of
random binary linear block codes under ML decoding where the transmission takes place over the BEC (see [5, Eq. (3.2)]). The code rate
examined is 0.75 bits per channel use and the block lengths are N = 1024, 2048, 4096, 8192 and 16384 bits.
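
For reference, the block error probability of random binary linear codes over the BEC under ML decoding can be evaluated numerically. The Python sketch below uses the standard expression for the ensemble defined by a uniformly random $(n-k) \times n$ parity-check matrix, where decoding fails whenever the erased positions cannot be resolved uniquely, i.e., whenever the corresponding columns are linearly dependent. Whether this coincides exactly with the form of [5, Eq. (3.2)] is an assumption, and the function name is hypothetical.

```python
import math

def bec_random_linear_ml_block_error(n, k, eps):
    """Average ML block error probability over the BEC, 0 < eps < 1.

    Assumed ensemble: uniformly random (n-k) x n binary parity-check matrix.
    """
    r = n - k
    log_indep = 0.0      # ln P(e given columns are linearly independent)
    total = 0.0
    for e in range(n + 1):
        if e > 0:
            i = e - 1
            if i >= r:
                log_indep = -math.inf              # more erasures than parity checks
            else:
                log_indep += math.log1p(-2.0 ** (i - r))
        fail = 1.0 if log_indep == -math.inf else -math.expm1(log_indep)
        # ln of the binomial probability of exactly e erasures
        log_binom = (math.lgamma(n + 1) - math.lgamma(e + 1) - math.lgamma(n - e + 1)
                     + e * math.log(eps) + (n - e) * math.log1p(-eps))
        total += math.exp(log_binom) * fail
    return total
```

A call such as `bec_random_linear_ml_block_error(1024, 768, 0.2)` corresponds to the shortest block length and the code rate considered in Figure 8; the binomial weights are formed in the log domain so that the larger block lengths of the figure do not cause overflows.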

In recent years, the BEC has been the focus of much attention in the field of iterative decoding techniques.
The simplicity of this channel and the absolute reliability of the known values at the output lend themselves to
a one-dimensional analysis of turbo-like codes and the performance of their iterative decoding algorithms in the
case where the codes are transmitted over the BEC (see, e.g., [27]). For the asymptotic case where we let the
block length tend to infinity, several families which achieve the capacity of the BEC under iterative decoding have
been constructed; these include low-density parity-check (LDPC) [16], irregular repeat-accumulate (IRA) [18] and
accumulate-repeat-accumulate (ARA) codes [19]; several ensembles of IRA and ARA codes were demonstrated to
achieve the capacity of the BEC with bounded complexity per information bit, in contrast to LDPC codes without
puncturing whose decoding complexity necessarily becomes unbounded as the gap to capacity vanishes (see [22],
[18], [19]). These discoveries motivate a study of the performance of iteratively decoded codes defined on graphs
for moderate block lengths (see, e.g., [33]). In Figure 8, we compare the ISP lower bound and the exact block error
probability of random linear block codes transmitted over the BEC as given in [5, Eq. (3.2)]. The figure refers to
codes of rate 0.75 bits per channel use and various block lengths. It can be observed that for a block length of
1024 bits, the difference in the channel erasure probability for which the random coding bound and the ISP bound
achieve the same target block error probability is 0.035, while for a block length of 16384 bits, this gap is decreased to
0.009. This indicates that the ISP bound is reasonably tight, and also suggests that this bound can be used in order to
assess the imperfectness of turbo-like codes even for moderate block lengths.

C. Minimal Block Length as a Function of Performance

In a wide range of applications, the system designer needs to design a communication system which fulfills
several requirements on the available bandwidth and on the acceptable delay for transmitting and processing the data, while
maintaining a certain fidelity criterion in reconstructing the data (e.g., the block error probability needs to be
below a certain threshold). In this setting, one wishes to design a block code which satisfies the delay constraint
(i.e., the block length is limited) while adhering to the required performance over the given channel. By fixing
the communication channel model, the code rate (which is related to the bandwidth expansion caused by the
error-correcting code) and the block error probability, sphere-packing bounds are transformed into lower bounds on the
minimal block length required to achieve the desired block error probability at a certain gap to capacity using an
arbitrary block code and decoding algorithm. Similarly, by fixing these parameters, the random coding bound of
Gallager [11] is transformed into an upper bound on the block length required for ML decoded random codes to
achieve a desired block error probability on a given communication channel.
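
The transformation described above amounts to a one-dimensional search over the block length. The following Python sketch shows one way to carry it out. It assumes that the lower bound on the block error probability, evaluated at a fixed rate and channel, is nonincreasing in the block length N; the function name `min_block_length` and the exponential expression used in the usage example are purely hypothetical placeholders and are not bounds from this paper.

```python
import math

def min_block_length(bound, target, n_start=1, n_limit=10**9):
    """Smallest N >= n_start with bound(N) <= target.

    Assumes bound(N), a lower bound on the block error probability,
    is nonincreasing in N (an assumption of this sketch).
    """
    if bound(n_start) <= target:
        return n_start
    lo, hi = n_start, n_start
    while bound(hi) > target:              # exponential search for a bracket
        lo, hi = hi, hi * 2
        if hi > n_limit:
            raise ValueError("target not reached within n_limit")
    while hi - lo > 1:                     # invariant: bound(lo) > target >= bound(hi)
        mid = (lo + hi) // 2
        if bound(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi

# Usage with a toy placeholder bound (NOT a sphere-packing bound):
toy_bound = lambda n: math.exp(-0.01 * n)
n_min = min_block_length(toy_bound, 1e-5)
```

Any code whose block error probability is at most the target must have a block length of at least the returned value, since for shorter lengths the lower bound itself exceeds the target. Applying the same search to an upper bound such as the random-coding bound gives instead a block length that is sufficient for ML decoded random codes.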

In this section, we consider some practically decodable codes taken from some recent papers ([1], [7], [8], [28],
[30], [34]). We examine the gap between the channel capacity and the $E_b/N_0$ for which they achieve a required block error
probability as a function of the block length of these codes. The performance of these specific codes together with
their practical decoding algorithms is compared with the sphere-packing and random coding bounds; these bounds
serve here as lower and upper bounds, respectively, on the block length required to achieve a given block error
probability and code rate on a given channel using an optimal block code and decoding algorithm. This comparison
shows how far, in terms of delay, some modern error-correcting codes with their sub-optimal and practical decoding
algorithms are from the fundamental limitations imposed by information theory.




[Figure 9: horizontal axis: block length [bit].]

Fig. 9. Bounds on the block length required to achieve a block error probability of $10^{-5}$ compared with the performance of some practically
decodable codes. The considered communication channel model is the BPSK modulated AWGN channel and the rate of the codes is 0.5
bits per channel use. The depicted bounds are the 1959 sphere-packing (SP59) bound of Shannon [24], the improved sphere-packing (ISP)
bound introduced in Section III, and the random coding upper bound of Gallager [11]. The codes are taken from [34] (code 1), [8] (codes
2 and 4) and [7] (code 3).

Figure 9 considers some block codes of rate 1/2 bit per channel use which are BPSK modulated and transmitted
over the AWGN channel. The plot depicts the gap to capacity in dB for which these codes achieve block error
probabilities of $10^{-4}$ and $10^{-5}$ under their practical decoding algorithms as a function of their block length. As a
reference, this figure also plots lower bounds on the block length which stem from the SP59 and ISP bounds and
the upper bound on the block length of random codes which stems from the random-coding bound of Gallager.
The three bounds refer to a block error probability of $10^{-5}$. The code labeled 1 is a block code of length 192 bits
which is decoded using a near-ML decoder by applying 'box and match' decoding techniques [34]. It is observed
that this code outperforms the random-coding upper bound for ML decoded random codes with the same block length
and code rate. It is also observed that this code achieves a block error probability of $10^{-5}$ at a gap to capacity of
2.76 dB, while the SP59 bound gives that the block length required to achieve this performance is lower bounded
by 133 bits (so the bound is very informative). The codes labeled 2, 3 and 4 are prototype-based LDPC codes of
lengths 2048, 5176 and 8192 bits, respectively (codes 2 and 4 are taken from [8] and code 3 is taken from [7]).
These codes achieve under iterative decoding a block error probability of $10^{-5}$ at gaps to capacity of 1.70, 1.27
and 1.07 dB, respectively. In terms of block length, the gap between the performance of these codes under iterative
decoding and the SP59 lower bound on the block length required to achieve a block error probability of $10^{-5}$ at
these channel conditions is less than one order of magnitude. It is also noted that throughout the range of block
lengths depicted in Figure 9, the gap between the lower bound on the block length of optimal codes which stems
from the better of the two sphere-packing bounds and the upper bound on the block length of random codes is
less than one order of magnitude. This exemplifies the tightness of the sphere-packing bounds when used as lower
bounds on the block lengths of optimal codes.

Fig. 10. Bounds on the block length required to achieve a block error probability of $10^{-5}$ compared with the performance of some
practically decodable codes. The considered communication channel model is the BPSK modulated AWGN channel and the rate of the codes
is 0.88 bits per channel use. The depicted bounds are the improved sphere-packing (ISP) bound introduced in Section III and the random
coding upper bound of Gallager [11]. Codes 1, 2, 3 and 4 are taken from [1], [7], [28] and [30], respectively.

Figure 10 considers some LDPC codes of rate 0.88 bits per channel use which are BPSK modulated and
transmitted over the AWGN channel. The gap to capacity in dB for which these codes achieve block error
probabilities of $10^{-4}$ and $10^{-5}$ under iterative decoding is plotted as a function of block length. As in Figure 9,
the figure uses lower and upper bounds on the block length which stem from the ISP and random-coding bounds,
respectively. For the rate and block lengths depicted, the SP59 bound is universally looser than the ISP bound
and hence it is not shown. The bounds refer to a block error probability of $10^{-5}$. For the examined block error
probabilities, the depicted codes require a gap to capacity of between 0.63 and 1.9 dB. For this range of $E_b/N_0$,
the lower bound on the block lengths which is derived from the ISP bound is looser than the one given by the
SP59 bound. However, both bounds are not very informative in this range. For cases where the gap to capacity is
below 0.5 dB, the difference between the lower bound on the block length of optimal codes which stems from the
ISP bound and the upper bound on the block length of random codes is less than one order of magnitude. Code
number 1 is an LDPC code of length 1448 bits whose construction is based on balanced incomplete block designs
[1]. This code achieves a block error probability of $10^{-5}$ at a gap to capacity of 1.9 dB, while the random-coding
bound shows that the block length required to achieve this performance using random codes is upper bounded by
600 bits. The code labeled 2 is a prototype-based LDPC code of length 5176 bits which is taken from [7]. Code
number 3 is a quasi-cyclic LDPC code of length 16362 bits taken from [28]. These codes achieve under iterative
decoding a block error probability of $10^{-5}$ at gaps to capacity of 1.02 and 0.86 dB, respectively. In terms of block
length, the gap between the performance of these codes under iterative decoding and the upper bound on the block
length of random codes which achieve a block error probability of $10^{-5}$ under the same channel conditions is less
than one order of magnitude. The code labeled 4 is a finite-geometry LDPC code of length 279552 bits which is
taken from [30]. For this code we only have the gap to capacity required to achieve a block error probability of
$10^{-4}$; however, it is clear that the difference in block length from the random coding upper bound becomes quite
large as the gap to capacity is reduced.

By fixing the block length and considering the gap $\Delta E_b/N_0$ between the performance of the specific codes
and the sphere-packing bounds in Figures 9 and 10, it is observed that the codes considered in these plots
exhibit gaps of 0.2 to 0.6 dB w.r.t. the information-theoretic limitation provided by the sphere-packing bounds (with
the exception of code 1 in Figure 10, which exhibits a gap of about 1.25 dB). In this respect we also mention that
some high-rate turbo-product codes with moderate block lengths (see [4]) exhibit a gap of 0.75 to 0.95 dB w.r.t. the
information-theoretic limitation provided by the ISP bound. Based on numerical results in [31] for the ensemble of
uniformly interleaved (1144, 1000) turbo-block codes whose components are random systematic, binary and linear
block codes, the gap in $E_b/N_0$ between the ISP lower bound and an upper bound under ML decoding is 0.9 dB for
a block error probability of $10^{-7}$. These results exemplify the strength of the sphere-packing bounds for assessing
the theoretical limitations of block codes and the power of iteratively decoded codes (see also [9], [14], [15], [23],
[35]).

VI. Summary 

This paper presents an improved sphere-packing (ISP) bound for finite-length block codes whose transmission
takes place over symmetric memoryless channels. The improvement in the tightness of the bound is especially
pronounced for codes of short to moderate block lengths, and some of its applications are exemplified in this paper.
The derivation of the ISP bound was stimulated by the remarkable performance and feasible complexity of
turbo-like codes with short to moderate block lengths. We were motivated by recent improvements on the sphere-packing
bound of [25] for finite block lengths, as suggested by Valembois and Fossorier [35].

We first review the classical sphere-packing bounds, i.e., the 1959 sphere-packing bound (SP59) derived by
Shannon for equal-energy signals transmitted over the Gaussian channel [24], and the 1967 sphere-packing (SP67)
bound derived by Shannon, Gallager and Berlekamp for discrete memoryless channels [25]. The ISP bound,
introduced in Section III, is uniformly tighter than the classical SP67 bound [25] and the bound in [35].

We apply the ISP bound to M-ary PSK block coded modulation schemes whose transmission takes place over an 
AWGN channel and the received signals are coherently detected. The tightness of the ISP bound is exemplified by 
comparing it with upper and lower bounds on the ML decoding error probability and also with reported computer 
simulations of turbo-like codes under iterative decoding. The paper also presents a new algorithm which performs 
the entire calculation of the SP59 bound in the logarithmic domain, thus facilitating the exact calculation of the 
SP59 bound for all block lengths without the need for asymptotic approximations. It is shown that the ISP bound 
suggests an interesting alternative to the SP59 bound, where the latter is specialized for the AWGN channel. 

In a wide range of applications, one wishes to design a block code which satisfies a known delay constraint (i.e.,
the block length is limited) while adhering to a required performance over a given channel model. By fixing the
communication channel model, code rate and the block error probability, sphere-packing bounds are transformed
into lower bounds on the minimal block length required to achieve the desired block error probability at a certain
gap to capacity using an arbitrary block code and decoding algorithm. Comparing the performance of specific codes
and decoding algorithms to the information-theoretic limitations provided by the sphere-packing bounds enables
one to deduce how far, in terms of delay, a practical system is from the fundamental limitations of information
theory. Further details on the comparison between practically decodable codes and the sphere-packing bounds are
found in Section V-C.

The ISP bound is especially attractive for block codes of short to moderate block lengths, and its advantage is
most pronounced for high-rate codes (see Figs. 3-7). Its improvement over the SP67 bound and the bound in [35,
Theorem 7] is also more significant as the input alphabet of the considered modulation is increased.

Appendix A: Proof of Lemma 3.1

We consider a symmetric DMC with input alphabet $\mathcal{K} = \{0, \ldots, K-1\}$, output alphabet
$\mathcal{J} = \{0, \ldots, J-1\}$ (where $J, K \in \mathbb{N}$) and transition probabilities $P$. Let
$\{g_k\}_{k=0}^{K-1}$ be the set of unitary functions which satisfy the conditions (24) and (25) in Definition 3.2.
To prove Lemma 3.1, we start with a discussion on the distribution $q_s$ which satisfies (48).

a) On the input distribution $q_s$ for symmetric DMCs:
Lemma A.1: For symmetric DMCs and an arbitrary value of $s \in (0,1)$, the uniform distribution
$q_{k,s} = \frac{1}{K}$, $k \in \mathcal{K}$, satisfies (48) with equality.
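Proof: To prove the lemma, it is required to show that

$$
\sum_{j=0}^{J-1} P(j|k)^{1-s} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}}
= \frac{1}{K} \sum_{j=0}^{J-1} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{1}{1-s}},
\quad \forall\, k \in \mathcal{K}. \tag{A.1}
$$

Let us consider some $k \in \mathcal{K}$. Examining the left-hand side (LHS) of (A.1) gives

$$
\begin{aligned}
& \sum_{j=0}^{J-1} P(j|k)^{1-s} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \\
&\quad = \frac{1}{K} \cdot K \sum_{j=0}^{J-1} \left\{ P(j|k)^{1-s} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \right\} \\
&\quad \overset{(a)}{=} \frac{1}{K} \sum_{\tilde{k}=0}^{K-1} \sum_{j=0}^{J-1} \left\{ P(j|k)^{1-s} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \right\} \\
&\quad \overset{(b)}{=} \frac{1}{K} \sum_{\tilde{k}=0}^{K-1} \sum_{j=0}^{J-1} \left\{ P\bigl(g_{\tilde{k}}(j)|k\bigr)^{1-s} \left( \sum_{k'=0}^{K-1} P\bigl(g_{\tilde{k}}(j)|k'\bigr)^{1-s} \right)^{\frac{s}{1-s}} \right\}
\end{aligned} \tag{A.2}
$$

where (a) holds by summing over a dummy variable $\tilde{k} \in \{0, 1, \ldots, K-1\}$ instead of the multiplication by $K$ in
the previous line, and (b) holds since $g_{\tilde{k}}$ is unitary for all $\tilde{k} \in \{0, 1, \ldots, K-1\}$ (see (23) where the integral is
replaced here by a sum). For all $j \in \mathcal{J}$ and $k, \tilde{k} \in \{0, 1, \ldots, K-1\}$, the symmetry properties in (24) - (26) give

$$
P\bigl(g_{\tilde{k}}(j)|k\bigr) \overset{(a)}{=} P\bigl((g_k^{-1} \circ g_{\tilde{k}})(j)\,|\,0\bigr)
\overset{(b)}{=} P\bigl(g_{(\tilde{k}-k) \bmod K}(j)\,|\,0\bigr)
\overset{(c)}{=} P\bigl(j\,|\,(k-\tilde{k}) \bmod K\bigr) \tag{A.3}
$$

where (a) follows from (24), (b) relies on (25), and (c) follows from (24) and (26). Similarly, for all $j \in \mathcal{J}$ and
$\tilde{k} \in \{0, 1, \ldots, K-1\}$

$$
\sum_{k'=0}^{K-1} P\bigl(g_{\tilde{k}}(j)|k'\bigr)^{1-s}
\overset{(a)}{=} \sum_{k'=0}^{K-1} P\bigl(j\,|\,(k'-\tilde{k}) \bmod K\bigr)^{1-s}
\overset{(b)}{=} \sum_{k'=0}^{K-1} P(j|k')^{1-s} \tag{A.4}
$$

where (a) relies on (A.3) and (b) holds since when $k'$ takes all the values in $\{0, 1, \ldots, K-1\}$, so does $(k'-\tilde{k}) \bmod K$.
Substituting (A.3) and (A.4) in (A.2) gives

$$
\begin{aligned}
& \sum_{j=0}^{J-1} P(j|k)^{1-s} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \\
&\quad = \frac{1}{K} \sum_{\tilde{k}=0}^{K-1} \sum_{j=0}^{J-1} \left\{ P\bigl(j\,|\,(k-\tilde{k}) \bmod K\bigr)^{1-s} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \right\} \\
&\quad = \frac{1}{K} \sum_{j=0}^{J-1} \left\{ \left( \sum_{\tilde{k}=0}^{K-1} P\bigl(j\,|\,(k-\tilde{k}) \bmod K\bigr)^{1-s} \right) \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \right\} \\
&\quad \overset{(a)}{=} \frac{1}{K} \sum_{j=0}^{J-1} \left\{ \left( \sum_{\tilde{k}=0}^{K-1} P(j|\tilde{k})^{1-s} \right) \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{s}{1-s}} \right\} \\
&\quad = \frac{1}{K} \sum_{j=0}^{J-1} \left( \sum_{k'=0}^{K-1} P(j|k')^{1-s} \right)^{\frac{1}{1-s}}
\end{aligned} \tag{A.5}
$$

where equality (a) holds since when the index $\tilde{k}$ takes all the values in $\{0, 1, \ldots, K-1\}$, so does $(k-\tilde{k}) \bmod K$.

■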

We now turn to explore how the symmetry of the channel and the input distribution $q_s$ induce a symmetry on the
probability tilting measure $f_s$.

b) On the symmetry of the tilting measure $f_s$ for strictly symmetric DMCs:
Lemma A.2: For all $s \in (0, 1)$, $k \in \mathcal{K}$ and $j \in \mathcal{J}$, the tilting measure $f_s$ in (50) satisfies

$$
f_s(j) = f_s\bigl(g_k(j)\bigr) \tag{A.6}
$$

Proof: Examining the definition of $f_s$ in (50), it can be observed that it suffices to show that

$$
\alpha_{j,s} = \alpha_{g_k(j),s}, \quad \forall\, s \in (0,1),\ k \in \mathcal{K},\ j \in \mathcal{J}
$$

where $\alpha_{j,s}$ is given in (49). This equality is proved in (A.4), referring to the uniform input distribution where
$q_{k,s} = \frac{1}{K}$ for all $k \in \mathcal{K}$. ■
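
Lemma A.2 can be illustrated on the BEC. The sketch below assumes that (49) and (50) take the standard form $\alpha_{j,s} = \sum_k q_{k,s} P(j|k)^{1-s}$ and $f_s(j) = \alpha_{j,s}^{1/(1-s)} / \sum_{j'} \alpha_{j',s}^{1/(1-s)}$, and that the symmetry maps of the BEC are the identity ($g_0$) and the swap of the two non-erased outputs ($g_1$). The names `tilting_measure` and `g` and the parameter values are hypothetical.

```python
def tilting_measure(P, s):
    """f_s(j) under the uniform input distribution (assumed form of (49)-(50))."""
    K, J = len(P), len(P[0])
    alpha = [sum(P[k][j] ** (1 - s) for k in range(K)) / K for j in range(J)]
    weights = [a ** (1 / (1 - s)) for a in alpha]
    total = sum(weights)
    return [wt / total for wt in weights]


p, s = 0.3, 0.4                            # assumed erasure probability and s
bec = [[1 - p, 0.0, p],                    # outputs ordered as (0, 1, erasure)
       [0.0, 1 - p, p]]
g = [[0, 1, 2],                            # g[0]: identity
     [1, 0, 2]]                            # g[1]: swaps outputs 0 and 1, fixes the erasure

f = tilting_measure(bec, s)
invariant = all(abs(f[j] - f[g[k][j]]) < 1e-15 for k in range(2) for j in range(3))
print(f, invariant)
```

The printed flag is `True`: $f_s$ assigns the same mass to the outputs 0 and 1, so it is invariant under both maps.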

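Having established some symmetry properties of $q_s$ and $f_s$, we are ready to prove equalities (51) - (53).

c) On the independence of $\mu_k$ and its two derivatives from $k$: As we have shown, the uniform distribution
$q_s$ satisfies (48) with equality for all inputs, so

$$
\begin{aligned}
\mu_k(s) &= \ln \left( \sum_{j=0}^{J-1} P(j|k)^{1-s} f_s(j)^s \right) \\
&\overset{(a)}{=} \ln \left( \frac{\sum_{j=0}^{J-1} P(j|k)^{1-s} \, \alpha_{j,s}^{\frac{s}{1-s}}}{\left( \sum_{j=0}^{J-1} \alpha_{j,s}^{\frac{1}{1-s}} \right)^{s}} \right) \\
&\overset{(b)}{=} (1-s) \ln \left( \sum_{j=0}^{J-1} \alpha_{j,s}^{\frac{1}{1-s}} \right) \\
&\overset{(c)}{=} (1-s) \ln \left( \sum_{j=0}^{J-1} \left( \sum_{k'=0}^{K-1} q_{k',s} P(j|k')^{1-s} \right)^{\frac{1}{1-s}} \right)
\end{aligned} \tag{A.7}
$$

where (a) follows from the choice of $f_s$ in (49) and (50), (b) follows from Lemma A.1 and (49), and (c) follows
from (49). Under the setting $s = \frac{\rho}{1+\rho}$, since the conditions on $q_s$ in (48) are identical to the conditions on the input
distribution $q = q_s$ which maximizes $E_0(\rho, q)$ as stated in [11, Theorem 4], then

$$
\begin{aligned}
\mu_k(s, f_s) &= (1-s) \ln \left( \sum_{j=0}^{J-1} \left( \sum_{k'=0}^{K-1} q_{k',s} P(j|k')^{1-s} \right)^{\frac{1}{1-s}} \right) \\
&= -(1-s) \, E_0\!\left( \frac{s}{1-s}, q_s \right) \\
&= -(1-s) \, E_0\!\left( \frac{s}{1-s} \right)
\end{aligned} \tag{A.8}
$$

where $E_0$ is given in (19). This proves (51).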

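We now turn to prove the independence of the first two derivatives of $\mu_k$ w.r.t. $s$ from $k \in \mathcal{K}$.

Remark A.1: Note that the partial derivative of $\mu_k(s)$ w.r.t. $s$ is performed while holding $f = f_s$ constant.

As is shown in [25],

$$
\mu'(s) = E_{Q_s}\bigl( D(j) \bigr), \qquad \mu''(s) = \mathrm{Var}_{Q_s}\bigl( D(j) \bigr) \tag{A.9}
$$

where $D(j) \triangleq \ln \frac{P_2(j)}{P_1(j)}$ and $Q_s(j) \triangleq \frac{P_1(j)^{1-s} P_2(j)^s}{\sum_{j'} P_1(j')^{1-s} P_2(j')^s}$.
For every $k \in \mathcal{K}$, $P_1$ and $P_2$ used in $\mu_k$ are defined to be $P(\cdot|k)$ and $f_s$, respectively. Hence, for all $k \in \mathcal{K}$,

$$
D_{k,s}(j) \triangleq \ln \left( \frac{f_s(j)}{P(j|k)} \right) \tag{A.10}
$$

$$
Q_{k,s}(j) \triangleq \frac{P(j|k)^{1-s} f_s(j)^s}{\sum_{j'=0}^{J-1} P(j'|k)^{1-s} f_s(j')^s} \tag{A.11}
$$

Applying (24) and Lemma A.2, we get that for all $k \in \mathcal{K}$,

$$
\sum_{j'=0}^{J-1} P(j'|k)^{1-s} f_s(j')^s
\overset{(a)}{=} \sum_{j'=0}^{J-1} P\bigl(g_k^{-1}(j')|0\bigr)^{1-s} f_s\bigl(g_k^{-1}(j')\bigr)^s
\overset{(b)}{=} \sum_{j'=0}^{J-1} P(j'|0)^{1-s} f_s(j')^s \tag{A.12}
$$

where (a) follows from (24) and Lemma A.2, and (b) follows since $g_k^{-1}$ is unitary. Substituting (A.12) in (A.11)
gives

$$
Q_{k,s}(j) \overset{(a)}{=} \frac{P\bigl(g_k^{-1}(j)|0\bigr)^{1-s} f_s\bigl(g_k^{-1}(j)\bigr)^s}{\sum_{j'=0}^{J-1} P(j'|0)^{1-s} f_s(j')^s}
\overset{(b)}{=} Q_{0,s}\bigl(g_k^{-1}(j)\bigr) \tag{A.13}
$$

where (a) follows from (24), (26), (50), (A.6) and (A.12), and (b) relies on the definition of $Q_{k,s}$ in (A.11).
Similarly,

$$
D_{k,s}(j) \overset{(a)}{=} \ln \left( \frac{f_s\bigl(g_k^{-1}(j)\bigr)}{P\bigl(g_k^{-1}(j)|0\bigr)} \right)
\overset{(b)}{=} D_{0,s}\bigl(g_k^{-1}(j)\bigr) \tag{A.14}
$$

where (a) follows from (24), (26) and (A.6), and (b) relies on the definition of $D_{k,s}$ in (A.10). Using (A.13) and
(A.14), we finally get for all $k \in \mathcal{K}$,

$$
\mu_k'(s) = E_{Q_{k,s}}\bigl( D_{k,s}(j) \bigr)
= \sum_{j=0}^{J-1} Q_{0,s}\bigl(g_k^{-1}(j)\bigr) \, D_{0,s}\bigl(g_k^{-1}(j)\bigr)
\overset{(a)}{=} \sum_{j=0}^{J-1} Q_{0,s}(j) \, D_{0,s}(j) = \mu_0'(s)
$$

and

$$
\begin{aligned}
\mu_k''(s) &= \mathrm{Var}_{Q_{k,s}}\bigl( D_{k,s}(j) \bigr) \\
&= \sum_{j=0}^{J-1} Q_{0,s}\bigl(g_k^{-1}(j)\bigr) \bigl( D_{0,s}\bigl(g_k^{-1}(j)\bigr) \bigr)^2 - \bigl( \mu_0'(s) \bigr)^2 \\
&\overset{(b)}{=} \sum_{j=0}^{J-1} Q_{0,s}(j) \bigl( D_{0,s}(j) \bigr)^2 - \bigl( \mu_0'(s) \bigr)^2 = \mu_0''(s)
\end{aligned}
$$

where (a) and (b) follow since $g_k^{-1}$ is unitary for all $k \in \mathcal{K}$. This completes the proof of Lemma 3.1.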
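
The independence of $\mu_k$, $\mu_k'$ and $\mu_k''$ from $k$ can be observed numerically. The following minimal sketch assumes the uniform input distribution, the standard form of $\alpha_{j,s}$ and $f_s$, and that $Q_{k,s}$ and $D_{k,s}$ are the tilted distribution and the log-likelihood ratio $\ln\bigl(f_s(j)/P(j|k)\bigr)$ of [25]. The function name `mu_and_derivatives` and the two example channels are hypothetical.

```python
import math


def mu_and_derivatives(P, s, k):
    """Return (mu_k, mu_k', mu_k'') at s, with f = f_s held fixed in the derivatives."""
    K, J = len(P), len(P[0])
    alpha = [sum(P[kp][j] ** (1 - s) for kp in range(K)) / K for j in range(J)]
    z = sum(a ** (1 / (1 - s)) for a in alpha)
    f = [a ** (1 / (1 - s)) / z for a in alpha]
    # Outputs with P(j|k) = 0 carry zero weight under Q_{k,s} and are skipped.
    support = [j for j in range(J) if P[k][j] > 0]
    w = {j: P[k][j] ** (1 - s) * f[j] ** s for j in support}
    norm = sum(w.values())
    Q = {j: w[j] / norm for j in support}
    D = {j: math.log(f[j] / P[k][j]) for j in support}
    d1 = sum(Q[j] * D[j] for j in support)                 # mean of D under Q
    d2 = sum(Q[j] * D[j] ** 2 for j in support) - d1 ** 2  # variance of D under Q
    return math.log(norm), d1, d2


p = 0.3
bec = [[1 - p, 0.0, p], [0.0, 1 - p, p]]           # outputs (0, 1, erasure)
w_row = [0.7, 0.2, 0.1]
cyclic = [[w_row[(j - k) % 3] for j in range(3)] for k in range(3)]

for P in (bec, cyclic):
    print([mu_and_derivatives(P, 0.4, k) for k in range(len(P))])
```

For each channel the printed triples agree for all inputs $k$ up to rounding errors.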

Remark A.2: Equalities (51) - (53) hold for arbitrary symmetric memoryless channels. For a general output
alphabet $\mathcal{J} \subseteq \mathbb{R}^d$, the proof of these properties follows the same lines as the proof here, with the exception
that the sums over $\mathcal{J}$ are replaced by integrals. As in Definition 3.1, if the projection of $\mathcal{J}$ over some of the $d$
dimensions is countable, the integration over these dimensions is turned into a sum.

Appendix B: Calculation of the Function $\mu_0$ in (47) for some Symmetric Channels

This appendix presents some technical calculations which yield the expressions for the function $\mu_0$ defined in
(47) and its first two derivatives w.r.t. $s$ (while holding $f_s$ fixed in the calculation of the partial derivatives of $\mu$
w.r.t. $s$, as required in [25]). The examined cases are M-ary PSK modulated signals transmitted over an AWGN
channel and binary block codes transmitted over the BEC. These expressions serve for the application of the VF
bound in [35] and the ISP bound derived in Section III to block codes transmitted over these channels.



A. The M-ary PSK modulated AWGN channel 

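For M-ary PSK modulated signals transmitted over the AWGN channel, the channel output alphabet is $\mathcal{J} = \mathbb{R}^2$. In the
case of a continuous output alphabet, the sums in (A.7) are replaced by integrals, and the transition probabilities are
replaced by transition probability density functions. Due to the symmetry of the channel, we get from Lemma A.1
that the distribution $q_s$ which satisfies (48) is uniform. Hence, we get by substituting (80) into (A.7) that

$$
\mu_0(s) = (1-s) \ln \left( \iint_{\mathbb{R}^2} \left( \frac{1}{M} \sum_{k=0}^{M-1} P(\mathbf{y}|k)^{1-s} \right)^{\frac{1}{1-s}} d\mathbf{y} \right)
$$

where, after factoring out $P(\mathbf{y}|0)$, the $k$-th summand depends on $\mathbf{y}$ only through the difference
$\|\mathbf{y}-\mathbf{x}_k\|^2 - \|\mathbf{y}-\mathbf{x}_0\|^2$. Since $\|\mathbf{x}_k\| = 1$ for all $k \in \{0, 1, \ldots, M-1\}$ we have

$$
\|\mathbf{y}-\mathbf{x}_k\|^2 - \|\mathbf{y}-\mathbf{x}_0\|^2 = -2 \langle \mathbf{y}, \mathbf{x}_k - \mathbf{x}_0 \rangle \tag{B.1}
$$

and so $\mu_0$ can be rewritten in the form $\mu_0(s) = (1-s) \ln \bigl( \theta(s) \bigr)$ where

$$
\theta(s) \triangleq \iint_{\mathbb{R}^2} P(\mathbf{y}|0) \left( \frac{1}{M} \sum_{k=0}^{M-1} \left( \frac{P(\mathbf{y}|k)}{P(\mathbf{y}|0)} \right)^{1-s} \right)^{\frac{1}{1-s}} d\mathbf{y} \tag{B.2}
$$

and where, by (80) and (B.1), each likelihood ratio $P(\mathbf{y}|k)/P(\mathbf{y}|0)$ is an exponential function of
$\langle \mathbf{y}, \mathbf{x}_k - \mathbf{x}_0 \rangle$.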
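
The identity (B.1) only uses the fact that all signal points have unit norm, which can be confirmed numerically. In the minimal sketch below, the constellation labeling $\mathbf{x}_k = (\cos(2\pi k/M), \sin(2\pi k/M))$ and the names `psk_point` and `b1_sides` are assumptions made for illustration; the paper's own labeling of the PSK points may differ by a rotation, which does not affect the identity.

```python
import math
import random


def psk_point(k, M):
    """Unit-norm M-ary PSK signal point (assumed labeling)."""
    angle = 2 * math.pi * k / M
    return (math.cos(angle), math.sin(angle))


def b1_sides(y, k, M):
    """Return (LHS, RHS) of (B.1) for a received point y in the plane."""
    xk, x0 = psk_point(k, M), psk_point(0, M)
    lhs = ((y[0] - xk[0]) ** 2 + (y[1] - xk[1]) ** 2) \
        - ((y[0] - x0[0]) ** 2 + (y[1] - x0[1]) ** 2)
    rhs = -2 * (y[0] * (xk[0] - x0[0]) + y[1] * (xk[1] - x0[1]))
    return lhs, rhs


random.seed(0)
y = (random.gauss(0, 1), random.gauss(0, 1))
for k in range(8):
    print(k, b1_sides(y, k, 8))   # the two entries agree up to rounding
```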

We now turn to calculate the derivative of $\mu_0$ with respect to $s$ while holding $f = f_s$ constant. Substituting (80)
into the definition of $f_s$ in (50), we get that $f_s$ is given by

where the last equality follows from (B.1) and (B.2). The log-likelihood ratio $D_{0,s}$ in (A.10) is given by

where the second equality follows from (80) and (B.3). The distribution $Q_{0,s}$ in (A.11) is given by

where (a) and (c) rely on (B.1), (b) follows from Lemma 2.1 in the proof for symmetric channels, and (d) relies
on the definition of $\theta$ in (B.2). Substituting (B.4) and (B.5) in (A.9) we get

B. The Binary Erasure Channel

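Let us denote the output of the channel when an erasure has occurred by $\mathcal{E}$, and let $p$ designate the erasure
probability of the channel. Since the BEC is symmetric, the input distribution $q_s$ which satisfies (48) is uniform
(see Lemma A.1), and we get from (A.7)

$$
\mu_0(s, f_s) = (1-s) \ln \left( \frac{1-p}{2^{\frac{s}{1-s}}} + p \right)
$$

We now turn to calculate $f_s$ for the BEC; substituting the transition probabilities into (50) gives

$$
f_s(0) = f_s(1) = \frac{1-p}{2(1-p) + 2^{\frac{1}{1-s}} p} \tag{B.9}
$$

$$
f_s(\mathcal{E}) = \frac{2^{\frac{1}{1-s}} p}{2(1-p) + 2^{\frac{1}{1-s}} p} \tag{B.10}
$$

Substituting (B.9) and (B.10) into the definition of the distribution $Q_{0,s}$ in (A.11) gives

$$
Q_{0,s}(0) = \frac{1-p}{1-p + 2^{\frac{s}{1-s}} p}, \qquad
Q_{0,s}(1) = 0, \qquad
Q_{0,s}(\mathcal{E}) = \frac{2^{\frac{s}{1-s}} p}{1-p + 2^{\frac{s}{1-s}} p} \tag{B.11}
$$

and the LLR in (A.10) is given by

$$
D_{0,s}(0) = \ln \left( \frac{1}{2(1-p) + 2^{\frac{1}{1-s}} p} \right), \qquad
D_{0,s}(\mathcal{E}) = \ln \left( \frac{2^{\frac{1}{1-s}}}{2(1-p) + 2^{\frac{1}{1-s}} p} \right) \tag{B.12}
$$

Applying (B.11) and (B.12) we get from (A.11)

$$
\begin{aligned}
\mu_0'(s, f_s) &= E_{Q_{0,s}}(D_{0,s}) \\
&= \frac{1-p}{1-p + 2^{\frac{s}{1-s}} p} \, \ln \left( \frac{1}{2(1-p) + 2^{\frac{1}{1-s}} p} \right)
 + \frac{2^{\frac{s}{1-s}} p}{1-p + 2^{\frac{s}{1-s}} p} \, \ln \left( \frac{2^{\frac{1}{1-s}}}{2(1-p) + 2^{\frac{1}{1-s}} p} \right) \\
&= \ln \left( \frac{1}{2 \bigl( 1-p + 2^{\frac{s}{1-s}} p \bigr)} \right)
 + \frac{2^{\frac{s}{1-s}} p}{1-p + 2^{\frac{s}{1-s}} p} \cdot \frac{\ln 2}{1-s}
\end{aligned}
$$

and

$$
\begin{aligned}
\mu_0''(s, f_s) &= E_{Q_{0,s}}(D_{0,s}^2) - \bigl( \mu_0'(s, f_s) \bigr)^2 \\
&= \frac{1-p}{1-p + 2^{\frac{s}{1-s}} p} \, \ln^2 \left( \frac{1}{2(1-p) + 2^{\frac{1}{1-s}} p} \right)
 + \frac{2^{\frac{s}{1-s}} p}{1-p + 2^{\frac{s}{1-s}} p} \, \ln^2 \left( \frac{2^{\frac{1}{1-s}}}{2(1-p) + 2^{\frac{1}{1-s}} p} \right)
 - \bigl( \mu_0'(s, f_s) \bigr)^2 \\
&= \frac{2^{\frac{s}{1-s}} p}{1-p + 2^{\frac{s}{1-s}} p} \left( \frac{\ln 2}{1-s} \right)^2 \left( 1 - \frac{2^{\frac{s}{1-s}} p}{1-p + 2^{\frac{s}{1-s}} p} \right) \\
&= \frac{2^{\frac{s}{1-s}} p (1-p)}{\bigl( 1-p + 2^{\frac{s}{1-s}} p \bigr)^2} \left( \frac{\ln 2}{1-s} \right)^2
\end{aligned}
$$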
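
The closed-form expressions for the BEC can be cross-checked against a direct evaluation of the two-point distribution $Q_{0,s}$ and the LLR $D_{0,s}$. The sketch below is an illustration under the assumption that $f_s$, $Q_{0,s}$ and $D_{0,s}$ take the standard forms used in this appendix; the function names `bec_mu` and `bec_mu_from_definitions` and the chosen values of $p$ and $s$ are hypothetical.

```python
import math


def bec_mu(p, s):
    """Closed-form (mu_0, mu_0', mu_0'') for a BEC with erasure probability 0 < p < 1."""
    b = 2 ** (s / (1 - s))
    a = 1 - p + b * p
    c = math.log(2) / (1 - s)
    mu = (1 - s) * math.log((1 - p) / b + p)
    d1 = -math.log(2 * a) + (b * p / a) * c
    d2 = (b * p * (1 - p) / a ** 2) * c ** 2
    return mu, d1, d2


def bec_mu_from_definitions(p, s):
    """Same quantities computed from f_s, Q_{0,s} and D_{0,s} directly."""
    t = 2 ** (1 / (1 - s))
    den = 2 * (1 - p) + t * p
    f0, fe = (1 - p) / den, t * p / den            # f_s(0) and f_s(erasure)
    w0 = (1 - p) ** (1 - s) * f0 ** s              # unnormalized Q_{0,s}(0)
    we = p ** (1 - s) * fe ** s                    # unnormalized Q_{0,s}(erasure)
    q0, qe = w0 / (w0 + we), we / (w0 + we)
    d0, de = math.log(f0 / (1 - p)), math.log(fe / p)
    d1 = q0 * d0 + qe * de
    d2 = q0 * d0 ** 2 + qe * de ** 2 - d1 ** 2
    return math.log(w0 + we), d1, d2


print(bec_mu(0.3, 0.4))
print(bec_mu_from_definitions(0.3, 0.4))   # should match the line above up to rounding
```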

Appendix C: Proof of Proposition 4.2

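From the definition of $f_N$ in (67), it follows that

$$
\begin{aligned}
f_N(x) &= \frac{1}{2^{\frac{N-1}{2}} \, \Gamma\!\left( \frac{N+1}{2} \right)} \int_0^{\infty} z^{N-1} \exp\!\left( -\frac{z^2}{2} + zx \right) dz \\
&= \frac{e^{\frac{x^2}{2}}}{2^{\frac{N-1}{2}} \, \Gamma\!\left( \frac{N+1}{2} \right)} \int_0^{\infty} z^{N-1} \exp\!\left( -\frac{(z-x)^2}{2} \right) dz \\
&= \frac{e^{\frac{x^2}{2}}}{2^{\frac{N-1}{2}} \, \Gamma\!\left( \frac{N+1}{2} \right)} \int_{-x}^{\infty} (u+x)^{N-1} \exp\!\left( -\frac{u^2}{2} \right) du
\end{aligned}
$$

From the binomial formula, we get

$$
f_N(x) = \frac{e^{\frac{x^2}{2}}}{2^{\frac{N-1}{2}} \, \Gamma\!\left( \frac{N+1}{2} \right)} \sum_{j=0}^{N-1} \binom{N-1}{j} x^{N-1-j} \int_{-x}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du \tag{C.1}
$$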

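We now examine the integrals in the RHS of (C.1). For odd values of $j$, we get

$$
\begin{aligned}
\int_{-x}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du
&= \int_{-x}^{x} u^j \exp\!\left( -\frac{u^2}{2} \right) du + \int_{x}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du \\
&= \int_{x}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du \\
&= \int_{0}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du - \int_{0}^{x} u^j \exp\!\left( -\frac{u^2}{2} \right) du
\end{aligned} \tag{C.2}
$$

where the second equality follows since the integrand is an odd function and the domain of the first integral is symmetric
around zero. For even values of $j$, we get

$$
\begin{aligned}
\int_{-x}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du
&= \int_{-x}^{0} u^j \exp\!\left( -\frac{u^2}{2} \right) du + \int_{0}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du \\
&= \int_{0}^{x} u^j \exp\!\left( -\frac{u^2}{2} \right) du + \int_{0}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du
\end{aligned} \tag{C.3}
$$

where the second equality holds since the integrand is an even function. Combining (C.2) and (C.3) gives that for
$j \in \{0, 1, \ldots, N-1\}$

$$
\begin{aligned}
\int_{-x}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du
&= \int_{0}^{\infty} u^j \exp\!\left( -\frac{u^2}{2} \right) du + (-1)^j \int_{0}^{x} u^j \exp\!\left( -\frac{u^2}{2} \right) du \\
&\overset{(a)}{=} \int_{0}^{\infty} (2t)^{\frac{j-1}{2}} e^{-t} \, dt + (-1)^j \int_{0}^{\frac{x^2}{2}} (2t)^{\frac{j-1}{2}} e^{-t} \, dt \\
&= 2^{\frac{j-1}{2}} \left[ \int_{0}^{\infty} t^{\frac{j-1}{2}} e^{-t} \, dt + (-1)^j \int_{0}^{\frac{x^2}{2}} t^{\frac{j-1}{2}} e^{-t} \, dt \right] \\
&= 2^{\frac{j-1}{2}} \, \Gamma\!\left( \frac{j+1}{2} \right) \left[ 1 + (-1)^j \, \gamma\!\left( \frac{x^2}{2}, \frac{j+1}{2} \right) \right]
\end{aligned}
$$

where (a) follows by substituting $t = \frac{u^2}{2}$, and the functions $\Gamma$ and $\gamma$ are defined in (75) and (76), respectively.
Substituting the last equality in (C.1) and also noting that

$$
\binom{N-1}{j} = \frac{\Gamma(N)}{\Gamma(N-j) \, \Gamma(j+1)}, \quad N \in \mathbb{N}, \ j \in \{0, 1, \ldots, N-1\}
$$

we get

$$
\begin{aligned}
f_N(x) &= \frac{e^{\frac{x^2}{2}}}{2^{\frac{N-1}{2}} \, \Gamma\!\left( \frac{N+1}{2} \right)} \sum_{j=0}^{N-1} \frac{\Gamma(N)}{\Gamma(N-j) \, \Gamma(j+1)} \, x^{N-1-j} \, 2^{\frac{j-1}{2}} \, \Gamma\!\left( \frac{j+1}{2} \right) \left[ 1 + (-1)^j \, \gamma\!\left( \frac{x^2}{2}, \frac{j+1}{2} \right) \right] \\
&\overset{(a)}{=} \sum_{j=0}^{N-1} e^{\frac{x^2}{2}} \, \frac{\Gamma\!\left( \frac{N}{2} \right)}{\Gamma(N-j) \, \Gamma\!\left( \frac{j}{2} + 1 \right)} \, 2^{\frac{N-2-j}{2}} \, x^{N-1-j} \left[ 1 + (-1)^j \, \gamma\!\left( \frac{x^2}{2}, \frac{j+1}{2} \right) \right] \\
&\overset{(b)}{=} \sum_{j=0}^{N-1} \exp\bigl( d(N, j, x) \bigr)
\end{aligned}
$$

where (a) follows from the equality

$$
\Gamma(2n) = \frac{2^{2n-1}}{\sqrt{\pi}} \, \Gamma(n) \, \Gamma\!\left( n + \frac{1}{2} \right), \quad n \neq 0, -\frac{1}{2}, -1, -\frac{3}{2}, \ldots
$$

and (b) follows from the definition of $d(N, j, x)$ in (74).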
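
The sum-of-exponentials form lends itself to a log-domain evaluation of $f_N(x)$. The following is a minimal sketch, not the paper's algorithm: it assumes that $\gamma$ in (76) is the regularized lower incomplete gamma function with arguments ordered as $\gamma(x^2/2, (j+1)/2)$, computes it with a plain power series (adequate only for moderate $x^2/2$, say below a few hundred), and combines the exponents with a log-sum-exp. The names `reg_lower_gamma` and `log_f_N` are hypothetical.

```python
import math


def reg_lower_gamma(a, x, max_terms=100000):
    """Regularized lower incomplete gamma function for a > 0, x >= 0 (power series)."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-17 * total:
            break
    return min(1.0, total * math.exp(a * math.log(x) - x - math.lgamma(a)))


def log_f_N(N, x):
    """ln f_N(x) for integer N >= 1 and x > 0, as a log-sum-exp of the exponents d(N, j, x)."""
    d = []
    for j in range(N):
        bracket = 1.0 + (-1) ** j * reg_lower_gamma((j + 1) / 2, x * x / 2)
        if bracket <= 0.0:
            continue                      # this term has vanished numerically
        d.append(x * x / 2 + math.lgamma(N / 2) - math.lgamma(j / 2 + 1)
                 - math.lgamma(N - j) + (N - 1 - j) * math.log(math.sqrt(2) * x)
                 - 0.5 * math.log(2) + math.log(bracket))
    d_max = max(d)                        # the j = 0 term is always present
    return d_max + math.log(sum(math.exp(v - d_max) for v in d))


print(log_f_N(2, 1.0))
print(log_f_N(2000, 5.0))                 # stays finite, although f_N itself overflows a double
```

Working with the exponents $d(N, j, x)$ and subtracting their maximum before exponentiating keeps every intermediate quantity in range. For odd $j$ and large $x$ the bracket $1 - \gamma(\cdot,\cdot)$ suffers from cancellation, so a production implementation would evaluate the upper incomplete gamma function directly.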